Does design matter when visualizing Big Data? An empirical study to investigate the effect of visualization type and interaction use

The need for good visualization is increasing as data volume and complexity expand. In order to work with high volumes of structured and unstructured data, visualizations that support the human ability to make perceptual inferences are of the utmost importance. In this regard, many interactive visualization techniques have been developed in recent years. However, little emphasis has been placed on the evaluation of their usability and, in particular, on design characteristics. This paper contributes to closing this research gap by measuring the effects of appropriate visualization use based on data and task characteristics. Further, we specifically test the feature of interaction, as it has been called an essential component of Big Data visualizations but has scarcely been isolated as an independent variable in experimental research. Data collection for the large-scale quantitative experiment was done using crowdsourcing (Amazon Mechanical Turk). The results indicate that both choosing an appropriate visualization based on task characteristics and using the feature of interaction increase usability considerably.


Introduction
One of the main purposes of management accounting is to provide decision-makers with relevant information for an easy, accurate, fast and rational decision-making process (Appelbaum et al. 2017;Dilla et al. 2010;Ohlert and Weißenberger 2015;Perkhofer et al. 2019b). Being able to fulfill this fundamental task is becoming more and more difficult as market dynamics increase (Eisl et al. 2012). The consequence is most likely a disruption of current working practice. Management accountants need to expand their scope from historical data reporting to real-time data processing, from using only in-house data to the inclusion of external data sources, from traditional paper-based to interactive and online-based reporting, and to shift the focus from reporting the past to predicting the future (Appelbaum et al. 2017;Goes 2014). To achieve this shift, new tools and technical instruments such as algorithms, appropriate management accounting systems and reporting software (Pasch 2019), and especially interactive data visualization, are necessary (Perkhofer et al. 2019b;Janvrin et al. 2014;Bačić and Fadlalla 2016;Ohlert and Weißenberger 2015).
Visualizing Big Data proves to be of great importance when problems (or tasks) are high in complexity (Hirsch et al. 2015;Perkhofer 2019), or not sufficiently well-defined for computers to handle algorithmically, meaning human involvement and transparency are required (e.g. in fraud detection) (Munzner 2014;Kehrer and Hauser 2013;Dilla and Raschke 2015;Keim et al. 2008). Visualizing data means organizing information by spatial location and supporting perceptual inferences (Perkhofer et al. 2019a). Perceptual inferences are comparatively easy for humans to draw, as their visual sense is superior (even with respect to fast programmable algorithms) and data transformation to the essential stores in human memory is astoundingly fast (Ware 2012;Keim 2001;Sweller 2010). Visualization thereby enhances the ability of both searching and recognizing, and thus significantly enhances sense-making capabilities (Munzner 2014;Bačić and Fadlalla 2016).
However, in order to optimally support the human perceptual system, an appropriate and easy-to-use visualization needs to be presented to the decision-maker (Munzner 2014;Pike et al. 2009;Vessey and Galletta 1991;Perkhofer 2019;Falschlunger et al. 2016a). Especially when data and tasks increase in complexity, as is the case with high-dimensional datasets, traditional business charts are no longer able to convey all information in one chart. Therefore, newer forms of visualizations, also called Big Data visualizations, have to be taken into account (Grammel et al. 2010). Big Data visualizations are unique in their purpose and designed to deal with, and present, larger amounts and various forms of data (Perkhofer et al. 2019b). Novel forms with the goal of presenting multidimensional data, often used in visual analytics (Liu et al. 2017;Bačić and Fadlalla 2016), range from parallel coordinates and polar coordinates plots, through sunburst, Sankey, and heatmap visualizations, to scatterplot matrices and parallel coordinated views (Liu et al. 2017;Albo et al. 2016;Bertini et al. 2011;Claessen and van Wijk 2011;Lehmann et al. 2010).
By using Big Data visualizations, the management accountant is able to show the whole dataset within one comprehensive visualization and is therefore able to generate new insights that would otherwise stay uncovered. For gaining insight, however, the visual display alone is not enough. The user needs to be able to interact with the interface (Elmqvist et al. 2011). Interacting in this context means using filter or selection techniques, drilling down to analyze the next layer of a data dimension, or interchanging data dimensions or value attributes (Perkhofer et al. 2019b;Heer and Shneiderman 2012). Only if the user is able to interactively work with the dataset and answer predefined questions, or questions that arise during the process of analysis, can Big Data visualizations unfold their full potential and new correlations, trends, or clusters be detected for further use (Perkhofer et al. 2019a, c).
Unlike conventional charts used in everyday life (e.g. line, pie, or bar charts), new visualizations require a close focus on design and interaction in order to be considered useful (Liu et al. 2017;Kehrer and Hauser 2013;Elmqvist et al. 2011;Pike et al. 2009;Bačić and Fadlalla 2016). Unfortunately, for both the design and use of new visualization options and the design and use of interaction, limited empirical research is available (Isenberg et al. 2013;Perkhofer et al. 2019b). Users still have to go through cost-intensive and unsatisfying trial-and-error routines in order to identify best practice, instead of being able to rely on empirical evidence (van Wijk 2013). This led us to identify two concrete and pressing questions in current literature, addressed in this study: (1) Appropriate use of new visualization types: Depending on data and task characteristics, some visualization types are claimed to outperform others when it comes to optimal decision support. However, these claims are mostly based on their developers' opinions or on small-scale user studies rather than on experimental research (Isenberg et al. 2013;Perkhofer 2019). As multiple options to visualize Big Data are available, we limit the scope of this study to visualizations for multidimensional data. This is because it is impossible for traditional forms to show more than three attributes or three dimensions at the same time within one visualization, which highlights the importance and need of Big Data visualizations and demonstrates their benefits. Further, as a starting point to investigate Big Data visualizations, we chose four frequently cited and actively used visualization types (for details see Table 1), namely the sunburst visualization, the Sankey visualization, the parallel coordinates plot, and the polar coordinates plot (Bertini et al. 2011;Keim 2002;Shneiderman 1996).
We wanted to investigate whether one particular visualization type can outperform the others based on the three tasks identify, compare, and summarize (classification based on Brehmer and Munzner 2013), using two different perspectives on the dataset (multiple hierarchy levels vs. multiple attribute comparisons). (2) Appropriate use of interaction techniques: Pike et al. claim that interaction "has not been isolated as an experimental variable" yet, thereby hindering direct causal interpretation of this highly discussed and frequently used visual analytics feature (Pike et al. 2009, p. 272). This is because most user studies concentrate on the visualization itself, while interaction is added as an integrated feature incorporated into the source code of the visual representation. Visualizations can be used and tested without interaction (as a static form); interaction, however, does not work without the visualization itself (Kehrer and Hauser 2013). "Exactly what kind or degree of benefit is realized by allowing a user to interact with visual representation is still undetermined." (Pike et al. 2009, p. 272). Consequently, to address this claim, we isolate the effect of interaction and evaluate the difference between an almost static and a highly interactive visualization.
Performance is measured by the three components of usability defined by ISO 9241 (efficiency, effectivity and satisfaction) as well as by one comprehensive sum-score for usability described and created by the authors. For data collection, we used the crowdsourcing platform Amazon Mechanical Turk, resulting in a large sample size of N = 2272. Results obtained via MTurk have been shown to be congruent with lab experiments in the context of visual analytics (Harrison et al. 2014), giving us confidence that it is an appropriate and reliable platform to test our selected visualization options. Statistical analysis was based on MANCOVA (for simultaneously assessing efficiency, effectivity and satisfaction) and ANCOVA (for assessing the sum-score for usability), respectively.
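The exact construction of the authors' sum-score is described later in the paper; the sketch below merely illustrates how such a composite usability score could be formed. It assumes, purely for illustration (this is not the authors' formula), that the three ISO 9241 components are z-standardized so they become comparable and are then averaged per participant.

```python
import statistics

def usability_sum_score(efficiency, effectivity, satisfaction):
    """Combine the three ISO 9241 usability components into one score.

    Each argument is a list of raw per-participant measurements.
    Illustrative only: each component is z-standardized, then the
    three standardized values are averaged per participant.
    """
    def zscores(values):
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        return [(v - mean) / sd for v in values]

    z = [zscores(component) for component in (efficiency, effectivity, satisfaction)]
    # Per-participant mean of the three standardized components.
    return [sum(triple) / 3 for triple in zip(*z)]

scores = usability_sum_score([10, 20, 30], [1, 2, 3], [5, 6, 7])
```

Standardizing before summing prevents a component measured on a wide scale (e.g. completion time in seconds) from dominating components measured on narrow scales (e.g. a satisfaction rating).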
Results indicate that the visualization type used and the degree of interaction influence efficiency and satisfaction, while the task type primarily influences effectivity. More precisely, from a user's perspective, information retrieval, and therefore a fast and accurate decision, is encouraged when being confronted with Cartesian-based rather than polar-based visualization types (the Sankey visualization or the parallel coordinates plot rather than the polar coordinates plot or the sunburst visualization) and when visualizations are made accessible in a highly interactive form. For users to make effective decisions, the underlying task needs to be supported by the visualization type. For example, the task type summarize is executed more effectively if the data presented in the visualization is already condensed by dimensions (e.g. the Sankey or the sunburst visualization), while the task type identify is easier to execute if each single data entry is presented as a single property within the visualization (e.g. the parallel coordinates plot). These results can be seen as general guidelines for Big Data visualization use in the context of managerial accounting; specific information on each of the visualization types used is also presented in this experimental study.
The remainder of this paper is structured as follows: First, the general purpose of visualizations, specific visualization types and interaction techniques are discussed and the hypotheses for our experimental design are presented. In Sect. 3, the study design is laid out in detail before the analysis is presented in Sect. 4. The last sections discuss and conclude our research findings, state limitations, and propose opportunities for further research in the context of interactive visualizations for multidimensional datasets.

Theoretical background and hypotheses
The fundamental goal of visualizations is to generate a particular insight or to support the execution of a specific task by emphasizing distinct features of the underlying dataset (Lurie and Mason 2007;Anderson et al. 2011). Insights can either be the discovery of trends, correlations, associations, clusters, and events (which allow the generation or verification of hypotheses), or the presentation of information to a particular audience by telling a persuasive and data-supported story for decision-making purposes (Brehmer and Munzner 2013). While telling a story mostly follows a standardized procedure such as reporting, the generation or verification of a hypothesis, in contrast, is typically ad hoc and unstructured (Perkhofer et al. 2019b). Especially in situations that ask for an ad hoc evaluation of a highly complex and large dataset, the use of Big Data visualization is of great importance (Chen and Zhang 2014;Falschlunger et al. 2016a). Consequently, users confronted with such problems have already established the use of Big Data visualizations. Pioneering examples can be found in fraud detection (Singh and Best 2019;Keim 2001), in analyzing network traffic (Keim et al. 2006), and in analyzing business models to reduce costs while maintaining quality (Appelbaum et al. 2017). Further, Big Data visualizations are also customary in companies with a strong focus on personalized marketing and social media, used to evaluate the implications of certain initiatives on product satisfaction and innovation (Keim 2001;Appelbaum et al. 2017).
Novel and interactive visualization types, such as those mentioned in the introduction (if used for their intended purpose and designed optimally), allow information to be uncovered which would otherwise stay hidden (Grammel et al. 2010). Currently, these new insights can be seen as a way to better attract customers or to optimize maintenance (Appelbaum et al. 2017); in the near future, however, generating insight from Big Data will be a necessity to stay competitive (Perkhofer et al. 2019b). Nonetheless, in order to generate insight, the users and their abilities as well as needs have to be considered in the process of selection and design (Perkhofer 2019;Endert et al. 2014). While targeting specific users or user groups has already been identified as an essential part of standardized or scientific visualization use (Yigitbasioglu and Velcu 2012;Speier 2006), researchers and developers working on Big Data visualizations unfortunately still put their sole focus on the generation of new visualization options to present a holistic view of the dataset (Perkhofer 2019). In doing so, they often fail to consider users' precise needs and risk that their visualizations misinform users or are not used at all (Isenberg et al. 2013;van Wijk 2005;Perkhofer et al. 2019b).
In order to create or select appropriate visualizations, three stages are crucial according to Brehmer and Munzner: (1) encode (select and design appropriate visual forms), (2) manipulate (enable the user to interact with the data), and (3) introduce (enable the user to add additional data and save results) (Brehmer and Munzner 2013). In this paper, we concentrate on the selection and the design of appropriate, and most importantly interactive visualizations and therefore put emphasis only on the first two stages.

Encode: choosing the visual representation and design
Visual representation is synonymous with visual encoding and means transforming data into pictures. Analyzing data through visual inference is easier and cognitively less demanding than looking at raw data because it allows for the identification of proximity, similarity, continuity, and closure (Zhou and Feiner 1998;Ohlert and Weißenberger 2015). Before we evaluate Big Data visualizations and their influencing factors based on usability (ISO 9241), a classification of the multiple options proposed and presented to potential users (in literature and free libraries such as D3.js 1) is necessary. We limit our investigation to frequently cited and open-sourced visualization options (Perkhofer et al. 2019a, b), as the evaluation of all Big Data visualization options goes beyond the scope of this paper.

Classification and description of frequently used Big Data visualizations
For classification, we distinguish between two features: the type of data that can be represented in the proposed visualization (1a. multiple dimensions but only one attribute 2 → hierarchical visualization vs. 1b. multiple attributes but only one dimension → multi-attribute visualization) and the basic layout (2a. Polar- or 2b. Cartesian-coordinates based visualizations). A summary of the identified visualizations is presented in Table 1 (please note that the table does not claim to be exhaustive, but should rather be seen as an indicator of frequently used or proposed visualization methods for multidimensional datasets, which is the selection criterion for our empirical analysis).
Based on this summary of highly cited and used visualization types, we can conclude that both a mix of Polar and Cartesian-coordinates based visualizations and a mix of hierarchical and multi-attribute based ones are common. From this pool of options, we picked the most frequently cited pair of each category for comparison. For a better understanding, each individual visualization type is explained in more detail in the following: The sunburst visualization (Polar-coordinates based layout and hierarchical data structure): The sunburst visualization is one of the more frequently used visualization types compared to other and newer forms of visualizations (Perkhofer et al. 2019b). It projects the multiple dimensions of the dataset in a hierarchically dependent manner onto rings and can therefore be classified as a Polar-coordinates based visualization option. The sunburst is a compact and space-filling presentation and shows the proportion of the total value contributed by each dimension and its sub-components through angular size (Rodden 2014). Due to the strict structure of a sunburst, the innermost ring represents the highest hierarchical level, and all dimensions dependent on it are represented in further rings towards the outside (Keim et al. 2006). The position of the rings influences interpretation; re-positioning these dimensions (using another sequence of dimensions for the display of the rings) therefore yields new and valuable insights. Additionally, based on the Vega-Lite specification, categorical color scales are used to encode discrete data values, each representing a distinct category, and sequential single-hue schemes are used to visually link related data characteristics (Satyanarayan et al. 2017).
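The ring structure described above can be sketched as a simple aggregation: one attribute is summed along each level of the dimension hierarchy, and a segment's angular size follows from its share of the total. The records, dimension names, and three-level depth below are hypothetical and serve only to illustrate the principle.

```python
from collections import defaultdict

# Hypothetical sales records: (region, product_group, product, sales).
records = [
    ("EMEA", "Hardware", "Laptop", 120),
    ("EMEA", "Hardware", "Tablet", 80),
    ("EMEA", "Software", "Suite", 50),
    ("APAC", "Hardware", "Laptop", 60),
]

def sunburst_rings(records, depth=3):
    """Aggregate one attribute (sales) along the dimension hierarchy.

    Ring 0 is the innermost ring (highest hierarchy level); deeper
    rings key their segments by the full path from the root, so each
    segment's angular size can be derived from its share of the total.
    """
    rings = [defaultdict(float) for _ in range(depth)]
    for *path, value in records:
        for level in range(depth):
            rings[level][tuple(path[: level + 1])] += value
    return rings

rings = sunburst_rings(records)
total = sum(rings[0].values())
# Angular size of a segment = 360 degrees * its share of the total.
angle_emea = 360 * rings[0][("EMEA",)] / total
```

Re-ordering the dimensions (e.g. product group before region) simply changes which keys appear in which ring, which is why re-positioning the rings yields different insights.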
The Sankey visualization (Cartesian-coordinates based layout and hierarchical data structure): The Sankey chart focuses on sequences of data, which can either be time-related or dependent on a hierarchical structure (Hofer et al. 2018). It is often used to present election analyses, as it allows one to highlight populations remaining loyal to the same party as well as populations changing their vote from one election to the next. Thus, the Sankey visualization is designed to present information on flows between multiple dimensions (e.g. processes, entities, …) (Chou et al. 2016). With regard to storytelling and sensemaking, interactions like re-ordering (changing the sequence of dimensions) and reducing the number of visible nodes to minimize visual clutter are indispensable (Chou et al. 2016). In addition, for a consistent analysis of the data, it is necessary to highlight information across nodes by making use of selectors (Hofer et al. 2018).
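The flows a Sankey visualization draws can be sketched as an aggregation of individual transitions into (source, target, flow) links, the data structure the bands between nodes are rendered from. The voter-transition data below is hypothetical and only illustrates the principle.

```python
from collections import Counter

# Hypothetical voter transitions between two consecutive elections.
voters = [
    ("Party A", "Party A"),  # loyal voter
    ("Party A", "Party B"),  # vote change
    ("Party B", "Party B"),
    ("Party B", "Party B"),
]

def sankey_links(transitions):
    """Aggregate individual transitions into (source, target, flow) links.

    Each link's flow value determines the width of the band drawn
    between the source node and the target node.
    """
    counts = Counter(transitions)
    return [(src, dst, n) for (src, dst), n in sorted(counts.items())]

links = sankey_links(voters)
```

With more than two dimensions, the same aggregation is repeated for each neighboring pair of node columns, which is why re-ordering the dimensions changes which flows become visible.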
The parallel coordinates plot (Cartesian-coordinates based layout and multi-attribute data structure): The parallel coordinates plot is a very popular and strongly recommended visualization in the InfoVis (Information Visualization) community and is highly cited in scientific research. This is because the parallel coordinates plot is one of the few visualizations able to present multiple attributes in one chart (Hofer et al. 2018;Perkhofer et al. 2019a). Two or more parallel dimension axes are connected via polygonal lines at the height of the respective dimension value (Keim 2002;Perkhofer et al. 2019c). To do so, data is geometrically transformed (Keim 2001) and each line represents one data entry (e.g. an order, a sales entry). With respect to interpretation, Inselberg introduced common rules for the identification of correlations and trends (Inselberg and Dimsdale 1990):
• lines that are parallel to each other suggest a positive correlation,
• lines crossing in an X-shape suggest a negative correlation, and
• lines crossing randomly show no particular relationship.
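Inselberg's visual reading rules have a simple numeric analogue: the pattern formed between two neighboring axes corresponds to the sign and strength of the correlation of those two attributes. The sketch below is illustrative only; the 0.7 threshold is an arbitrary choice for the example, not a rule from the literature.

```python
def relationship_between_axes(xs, ys):
    """Numeric analogue of Inselberg's reading rules for two neighboring axes.

    Mostly parallel lines correspond to a strong positive correlation,
    X-shaped crossings to a strong negative one, and random crossings
    to no particular relationship. Computes Pearson's r from scratch.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    r = cov / (var_x * var_y) ** 0.5
    if r > 0.7:
        return "parallel lines: positive correlation"
    if r < -0.7:
        return "X-shaped crossing: negative correlation"
    return "random crossings: no particular relationship"
```

For example, two attributes that rise together produce near-parallel line segments between their axes, while inversely related attributes produce the characteristic X-shape.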
Similar to the Sankey visualization, a user has to be able to re-arrange the axes on demand, as only neighboring axes can be interpreted in a meaningful way (Perkhofer et al. 2019a). By making use of both categorical/sequential single-hue color scales and filtering options, cluster analysis can be performed (Perkhofer et al. 2019c).
The polar coordinates plot (Polar-coordinates based layout and multi-attribute data structure): The polar coordinates plot is a radial projection of a parallel coordinates plot (Diehl et al. 2010). Attributes are arranged radially, and each attribute value is positioned proportionally to its magnitude relative to the attribute's minimum and maximum value. Each line connecting the attribute values represents one data entry. Characteristic of a polar coordinates plot is the detection of dissimilarities and outliers. Nonetheless, it is difficult to compare lengths across the different axes (Diehl et al. 2010). Users decoding a polar coordinates plot try to interpret the area that appears as soon as all attributes are connected. Unfortunately, areas that form arbitrarily, depending on the loosely selected order of attributes, misinform the user. Further, areas are more difficult to compare than straight lines connecting data points (Kim and Draper 2014), and data points in outer rings cause areas to appear disproportionately bigger; due to this distortion, angles are harder to assess than straight lines (Perkhofer et al. 2019a). These effects have not yet been tested (Albo et al. 2016).
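The radial projection described above can be sketched as a coordinate transform: attribute i of k is drawn on an axis at angle 2πi/k, and its value, normalized to [0, 1] by the attribute's minimum and maximum, becomes the radius. The function and data below are illustrative, not taken from any specific implementation.

```python
import math

def polar_projection(values, minima, maxima):
    """Project one data entry of a multi-attribute dataset onto polar axes.

    Attribute i of k is placed on a radial axis at angle 2*pi*i/k; its
    value, normalized by the attribute's min and max, becomes the radius.
    Connecting the returned (x, y) points yields the line (and enclosed
    area) the user sees in a polar coordinates plot.
    """
    k = len(values)
    points = []
    for i, (v, lo, hi) in enumerate(zip(values, minima, maxima)):
        r = (v - lo) / (hi - lo)
        theta = 2 * math.pi * i / k
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points

# One hypothetical data entry with three attributes, all on a 0-10 scale.
pts = polar_projection([5, 10, 0], [0, 0, 0], [10, 10, 10])
```

Because the enclosed area grows with the square of the radius, equal value differences look larger in the outer rings, which is one source of the distortion discussed above.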

Possible factors influencing usability of Big Data visualizations
After presenting the most frequently used visualization options for Big Data, we now discuss possible factors influencing their ability to encode specific information and make it accessible to users. As already explained, each visualization type has the potential to uncover and present a different type of insight to its audience (supporting a different task), while at the same time hiding another (Perkhofer 2019). As theories and experimental research on the process of encoding for interactive Big Data visualizations are limited, research from standard business graphics and dashboarding is used for hypotheses development (Falschlunger et al. 2016a;Speier 2006;Vessey and Galletta 1991;Perkhofer 2019;Yigitbasioglu and Velcu 2012). The purpose of this approach is to test existing principles for their applicability to interactive and new forms of visualizations, and to shed light on the process of encoding in order to foster decision-making.
Previous findings have shown that the following factors (explained in more detail below) influence the ability of the user to successfully decode information given a chosen visualization option (Perkhofer 2019;Falschlunger et al. 2016a, b;Ware 2012;Vessey and Galletta 1991;Speier 2006): (1) the design of the visualization, (2) the dataset, (3) the task, and (4) the decision-maker characteristics (in particular previous experience and knowledge on reading and interpreting visualizations).
With respect to the design of the visualization, it has been shown that a high data-ink ratio (Tufte 1983) and the display of coherent information in juxtaposition (Perkhofer 2019) allow for faster processing of information. Both of these principles are satisfied by Big Data visualizations, as they are designed to visualize the full dataset within one coherent visualization. However, a need for discussion can be identified when choosing a basic layout, as visualizations are based on either a polar-coordinate or a Cartesian-coordinate system, and the basic layout fundamentally changes the way information needs to be decoded by the user (Rodden 2014). While in polar-coordinates based visualizations angles need to be assessed, in Cartesian-coordinates based visualizations it is the height of a column or the length of a line that needs to be compared. With respect to standardized business charts, Cartesian-coordinates based visualizations (scatterplots, line and bar charts) are known to outperform polar-coordinates based ones (pie charts) (Diehl et al. 2010). However, this result on the most appropriate layout needs to be re-evaluated for Big Data visualizations, as interactivity might change results (Albo et al. 2016). Further, the share of polar-based visualizations among the available and applied visualization tools is quite large and therefore deserves a closer look. This leads to our first hypotheses: H1a: The basic layout influences the usability of a visualization. H1b: Cartesian-coordinate based visualization types outperform polar-coordinate based visualization types.
Next to the design, the underlying dataset influences usability. It is known that data can only be assessed as long as enough cognitive capacity is available for data processing (Sweller 2010;Atkinson and Shiffrin 1968;Miller 1956). Otherwise, or more precisely in a state of information overload, a negative effect on effectivity, efficiency, and satisfaction can be identified (Bawden and Robinson 2009;Falschlunger et al. 2016a). It has also been demonstrated that data which is presented in a familiar form (e.g. known since childhood), or which can be related to already known information stored in long-term memory, is processed faster and more accurately, as the burden it places on working memory is reduced (Perkhofer and Lehner 2019;Atkinson and Shiffrin 1968).
As presented in Table 1, one needs to distinguish between hierarchical and multi-attribute visualization types when dealing with multidimensional datasets. While for hierarchical visualizations only one attribute (e.g. one KPI such as sales) needs to be evaluated based on different levels and compositions of aggregation, for multi-attribute visualizations multiple attributes need to be processed. For the latter, not only do different measures need to be known and understood, but they also have to be analyzed in reference to each other for new insights to appear. Consequently, multi-attribute visualizations are said to increase the burden placed on the user (Falschlunger et al. 2016a), leading to the following hypotheses for our investigation: H2a: The underlying dataset influences the usability of a visualization. H2b: Hierarchy-based visualization types outperform multi-attribute based visualization types. Without question, and as already mentioned multiple times, tasks and insights differ with different visualization types. Matching the visualization to its respective task has been identified as a main influence in traditional visualization use. It has been shown that a mismatch increases cognitive load and consequently impairs decision-making outcomes (Falschlunger et al. 2016a, b;Dilla et al. 2010;Shaft and Vessey 2006;Speier 2006;Perkhofer 2019). Up to now, the question of tables versus charts has been extensively tested, resulting in a classification of spatial tasks (looking for trends, comparisons, etc.) as best supported by spatial visualizations (charts) and symbolic tasks (looking for specific data points) as best supported by symbolic visualizations (tables) (Vessey and Galletta 1991).
With respect to Big Data visualizations, a new classification of tasks has been established, namely identify (search for a specific data point), compare (compare two different data points or also two different aggregation levels), and summarize (generate overall insights by looking at the whole dataset) (Brehmer and Munzner 2013). However, these tasks have not yet been associated with visualization types or characteristics of visualization types.
Based on the fundamental activities that users have to perform and given the task type classification presented above, we hypothesize that the task identify is easier to perform if no previous aggregation based on different dimensions has influenced the visual appearance of the dataset. On the other hand, summarize should be easier to accomplish if the dataset has already (at least to some extent) been aggregated and not every single data point is displayed in isolation. With respect to compare, results will be better for single data comparison tasks if the display shows every data point in isolation, while results on the comparison of sub-dimensions (already aggregated data) will be better in hierarchical visualizations.

H3a: The task type influences the usability of a visualization. H3b: Users will perform better with a multi-attribute visualization than with a hierarchy-based visualization when confronted with the task type identify. H3c: Users will perform better with a hierarchy-based visualization than with a multi-attribute visualization when confronted with the task type summarize.
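The three task types can be sketched as three kinds of query against the same dataset: identify retrieves a single data point, compare contrasts two aggregates, and summarize condenses the whole dataset into one figure. The records and functions below are hypothetical illustrations of Brehmer and Munzner's taxonomy, not the study's actual task statements.

```python
# Hypothetical dataset: (region, product, sales).
data = [
    ("EMEA", "Laptop", 120),
    ("EMEA", "Tablet", 80),
    ("APAC", "Laptop", 60),
]

def identify(region, product):
    """identify: search for one specific data point."""
    return next(s for r, p, s in data if r == region and p == product)

def compare(region_a, region_b):
    """compare: contrast two aggregation levels (here: regional totals)."""
    total = lambda reg: sum(s for r, _, s in data if r == reg)
    return total(region_a) - total(region_b)

def summarize():
    """summarize: generate one overall insight from the whole dataset."""
    return sum(s for _, _, s in data) / len(data)
```

Seen this way, identify favors displays where each record remains an individual mark, while summarize favors displays where the aggregation has already been performed visually.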
Finally, yet importantly, the factor of user characteristics needs to be considered when choosing a specific visualization type. Studies on standard business charts have demonstrated that choosing the right chart type (bar, line, pie, or column) leads to contradictory results across different user groups (Falschlunger et al. 2016a). Only if the factor of previous experience (not only with the dataset and the KPIs but also with the respective visualization type used) is considered and included in the selection process do satisfying results follow (Falschlunger et al. 2016b;Perkhofer and Lehner 2019). This can again be explained by cognitive load theory: if the user has never seen the layout before, a lot of cognitive resources are needed to read the visualization rather than to interpret the information it represents. The more experience a user has with a specific visualization, the more reading strategies exist and the more automated the process of extracting data from the visualization becomes, leaving ample room for data interpretation (Perkhofer 2019).
H4: Previous experience/usage of the different visualization types positively affects usability.
Based on these findings, the different visualization options for representing multidimensional data presented and described above will most likely differ in usability depending on the task (identify, compare, summarize), the dataset used (hierarchical or multi-attribute data), the basic layout they use (Cartesian-based or polar-based), and the level of previous experience. Nonetheless, we hope to find general rules and guidelines, similar to those of standard and scientific visualization use, to guide designers and users.

Manipulate: using interaction to manipulate existing elements
Visualizations designed to present large amounts of data greatly benefit from interaction. In particular, the following processes are better supported by interactive visualizations when confronted with visual analytics tasks: detecting the expected, discovering the unexpected (generating hypotheses), and drawing data-supported conclusions (rejecting or verifying hypotheses) (Kehrer and Hauser 2013). To be more specific, working with interactive visualizations is driven by a particular analytics task. However, when working interactively, analysis does not end with finding a proper answer to the initial task, but rather allows the generation and verification of additional and different hypotheses, which are then called insights (Pike et al. 2009). These are generated only because the user interactively works with the dataset, and the process of doing so increases engagement, opportunity, and creativity (Brehmer and Munzner 2013;Dilla et al. 2010).
As a consequence, visualizations presented in Table 1 are claimed to become useful only once the user is able to interact with the data. Interaction is of such high importance because the actions of a user demonstrate the "discourse the user has with his or her information, prior knowledge, colleagues and environment" (Pike et al. 2009, p. 273). Further, the sequence of actions is not predefined but rather individual and dependent on the user. Interaction thereby particularly supports the user's knowledge base and perceptual abilities (Dilla et al. 2010;Brehmer and Munzner 2013;Elmqvist et al. 2011;Dörk et al. 2008;Liu et al. 2017). Consequently, interaction requires active physical and mental engagement, and throughout this process understanding is increased and decision-making capabilities are enhanced (Pike et al. 2009;Pretorius and van Wijk 2005;van Wijk 2005;Wilkinson 2005;Dix and Ellis 1998;Buja et al. 1996;Shneiderman 1996). In a static form, only a general overview is presented to the user; without the opportunity to interact, hypothesis verification or further hypothesis generation is extremely limited (Hofer et al. 2018;Liu et al. 2017;Pike et al. 2009;Perkhofer et al. 2019b). This not only frustrates users, but also contradicts the well-known mantra of visual information seeking: "overview first, zoom and filter, then details on demand" (Shneiderman 1996). On a more practical level, interaction allows the user to filter, select, navigate, arrange, or change either the amount of data or the characteristics of the visual display (for details on the interaction techniques see Table 20 in the "Appendix").
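Two of the interaction techniques named above, filtering and drilling down, can be sketched as operations on the underlying records; the visualization then simply re-renders the returned subset or aggregate. The record structure, field names, and hardcoded "sales" measure below are hypothetical and for illustration only.

```python
# Hypothetical records over which the interactions operate.
records = [
    {"region": "EMEA", "year": 2018, "sales": 120},
    {"region": "EMEA", "year": 2019, "sales": 140},
    {"region": "APAC", "year": 2019, "sales": 60},
]

def filter_records(records, **criteria):
    """Filter: keep only the entries matching all given criteria."""
    return [r for r in records if all(r[k] == v for k, v in criteria.items())]

def drill_down(records, dimension):
    """Drill down: aggregate the measure on the next dimension level.

    The measure ("sales") is hardcoded here purely for illustration.
    """
    totals = {}
    for r in records:
        totals[r[dimension]] = totals.get(r[dimension], 0) + r["sales"]
    return totals
```

A highly interactive visualization chains such operations on demand (e.g. filter to one region, then drill down by year), whereas a static one fixes a single pre-computed view.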
Results on the proper use of interaction techniques are limited. Unfortunately, the studies conducted so far tend to blur the concept of visualization with that of interaction (Pike et al. 2009). Existing recommendations, however, predominantly support the use of multiple interaction methods (Rodden 2014). "The more ways a user can 'hold' their data (by changing their form or exploring them from different angles and via different transformations), the more insight will accumulate" (Pike et al. 2009, p. 264). By intentionally clicking, scrolling, and filtering the data, the user gains a deeper understanding of the relations within the given dataset. Interaction is therefore an essential part of the sense-making process and enhances the user's processing and sense-making capabilities (Shneiderman 1996). Building on previous literature, the following hypotheses are presented:

H5a: Interaction influences the usability of a visualization.

H5b: Users will perform better with a highly interactive visualization than with a mostly static one.

(Fig. 1: Research model)

A within- and between-subjects experimental design (4 × 3 × 2) was used. The visualization type was manipulated at four levels: for the experiment, we chose the most frequently researched and available visualization types, the sunburst visualization, the Sankey visualization, the parallel coordinates plot, and the polar coordinates plot (please see Table 1). Two of the visualization types under investigation are in a Polar-based layout and two in a Cartesian-based one. Further, two out of four show a hierarchical dataset, while the other two present a multi-attribute one. The task type was manipulated at three levels: the statements were based on Brehmer and Munzner's task taxonomy (identify, compare, and summarize; Brehmer and Munzner 2013). Finally, interaction was manipulated at two levels (limited interaction, high interaction).
The experimental study was conducted using LimeSurvey and the crowdsourcing platform Amazon Mechanical Turk (MTurk). For each visualization type, a separate but identical experiment was created. Each participant evaluated only one visualization type, but had to assess various statements to simulate the process of hypothesis verification (task types). Visualizations were coded based on the D3.js library, then extended and adjusted to fit our purpose (the most significant changes concerned interaction techniques, as the available visualizations offered limited options). The visualizations are available for download on the authors' homepage or can be accessed via the links presented in Table 22. Specifications on the dataset, the tasks, and the visualizations used are presented in the following subsections, and the research model is presented in Fig. 1.

Data sample
We used a self-generated data sample for our study as a basis to compare the different visualization types. The dataset simulated a wine trade company and consisted of 9961 records, whereby each record represented a customer's order. During construction of the sample, six finance experts designed key metrics typically used in trade companies to simulate a close-to-reality example for data exploration. The dataset consisted of 14 dimensions (order number, trader, grape variety, winemaker, state, etc.) and 12 attributes (gross margin, net margin, gross sales, net sales, discounts, gross profit, shipping costs, etc.) in total. As a result, our dataset can be described as being structured and shows no inconsistencies or missing values. Users were confronted with a large amount of data shown within one visualization, including multiple possible dimensions and attributes in order to find patterns, trends, or outliers. This allowed the assumption that confusion and misunderstanding based on the dataset were kept to a minimum (also confirmed in pre-tests). Each visualization used, without any filters active, showed the complete underlying dataset of 9961 records.

Manipulation of the independent variables
As already explained in Sect. 2, we tested four distinct visualization types. These four visualization types can be characterized by two features: the structure of the data they are capable of displaying (hierarchical vs. multi-attribute data) and the overall layout of the visualization (horizontal/Cartesian vs. radial/Polar). Additionally, interaction is the central component for understanding and working with Big Data visualization tools. Therefore, drawing on existing prototypes and literature, two interaction concepts per visualization type were designed to establish comparability and fairness, but also to present each type in the best possible and most natural way. The visualization types used, together with their respective interaction concepts, are available online and presented in Tables 21 and 22 in the "Appendix".
Based on the previously described dataset, statements in accordance with Brehmer and Munzner's task classification model for Big Data visualizations were created and presented to the participants for evaluation in randomized order. Participants were asked in the experimental conditions to assess the statements' truth (examples are presented in Table 2). Each task type was assessed twice per visualization-interaction combination.

Dependent variable usability
For assessing the quality of a visualization, the effects on user performance (efficiency and effectiveness) alone are not sufficient; instead, one needs to measure the whole concept of usability (Pike et al. 2009; van Wijk 2013). Usability is defined by ISO 9241-11 as a combination of effectiveness, efficiency, and satisfaction (SAT). For effectiveness, we counted the number of statements answered correctly, while for efficiency, we measured the time for task execution (logged by LimeSurvey as soon as answers to a given task were submitted). With respect to satisfaction, we collected data not per single task but for each visualization and interaction level. Participants had to rate their satisfaction on a 5-point Likert scale (Question: "Please rate your overall level of satisfaction with the visualization in the figure presented below. Please bear in mind the experimental tasks when filling out the scale." Answer options: very satisfied, satisfied, neutral, unsatisfied, very unsatisfied).
As usability is measured inconsistently throughout the literature, we provide insights into all three sub-components but also introduce one comprehensive measure of usability for better readability of the results. To do so, we first calculate z-scores for the components and then adjust their absolute min-max distances (the largest distance exists for efficiency; this distance is used to re-scale the min and max of effectivity and satisfaction). After this data transformation, a sum score is calculated. Figure 2 documents the calculation and Fig. 3 shows details on the distribution of the variables used.
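The transformation described above can be sketched as follows. This is a minimal illustration, not the authors' actual script; the function names are ours, and we assume the efficiency component has already been oriented so that higher values are better (e.g. negated response time).

```python
from statistics import mean, pstdev

def zscores(xs):
    """Standardize a list of values (population standard deviation)."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]

def usability_sum_score(effectivity, efficiency, satisfaction):
    """Sum score per participant: z-score each component, then rescale
    effectivity and satisfaction so their min-max spread matches that of
    efficiency (assumed to be the widest), and sum the three."""
    z_eff = zscores(effectivity)
    z_time = zscores(efficiency)   # higher = better (e.g. negated time)
    z_sat = zscores(satisfaction)
    target = max(z_time) - min(z_time)  # the largest min-max distance
    lo_t = min(z_time)

    def rescale(xs):
        lo, hi = min(xs), max(xs)
        if hi == lo:
            return xs
        return [(x - lo) / (hi - lo) * target + lo_t for x in xs]

    return [a + b + c for a, b, c in
            zip(rescale(z_eff), z_time, rescale(z_sat))]
```

With this rescaling, no single component dominates the sum merely because its raw scale is wider.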

Control variable
Usage/experience was assessed for each visualization type on a 5-point Likert scale (1: no use to 5: daily use; question: "How often do you use the …"). Results are presented in Table 3.

Procedure
Before starting the experiment, an introduction explained the given dataset and the procedure of the study. Further, in order for participants to work effectively and efficiently with the visualizations and the various interaction techniques, we included a short video showing the visualization type in detail as well as all possible interaction techniques (the link remained available throughout the experiment in the help section). For each visualization type and each interaction stage, a verbal explanation was provided in addition to the video. After reading the introduction, participants' attention was tested with control questions to ensure quality. Only if 4 out of 6 questions were answered correctly was the data used for analysis. The static and the interactive layouts were grouped and presented to the participants in randomized order (participants started either with all questions based on the static layout or with all questions based on the interactive one). Tasks within each layout were again presented in randomized order. After completing the experimental tasks, participants filled out a preference questionnaire concerning interaction and provided information on their experience with visual analytics and with the visualization type under investigation. Additionally, demographic information was collected.
The studies (one per visualization type) were launched on Amazon Mechanical Turk in June 2018 (sunburst and Sankey visualization) and in January 2019 (parallel coordinates and polar coordinates plot). Participants were compensated for their participation ($10 per participant; the study lasted approximately 45 min). Participants were only compensated when the complete dataset was handed in and no response pattern (e.g. always choosing the same answer option, or choosing "no answer" more than 30% of the time) could be identified. For data analysis, we excluded the top 5% and bottom 5% of time needed per visualization to eliminate outliers. This is motivated by the researchers' previous experience in lab settings: we observed extremely low task times when not enough effort was put into solving the tasks (also leading to poor effectivity), and extremely high task times in cases of distraction. These steps were necessary to ensure high-quality data even without observing participants during the execution of the experimental tasks.
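The trimming rule can be expressed compactly. This is a sketch under our reading of the procedure; the record structure (a `"time"` field per observation) is hypothetical.

```python
def trim_time_outliers(records, frac=0.05):
    """Drop the fastest and slowest `frac` share of observations by task time."""
    ordered = sorted(records, key=lambda r: r["time"])
    k = int(len(ordered) * frac)  # number of records cut from each tail
    return ordered[k:len(ordered) - k] if k else ordered
```

Applied per visualization type, this removes both suspiciously fast and unusually slow task completions before analysis.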

Participants
Current evaluation practice for experimental research in the field of Information Visualization is to recruit undergraduate or graduate students for a lab experiment. However, depending on their degree, students may have very little experience in visual analytics and might thereby cause misleading results (van Wijk 2013). Hence, we decided to recruit participants via Amazon Mechanical Turk with the requirement of holding at least a US Bachelor's degree. We introduced the topic and asked for knowledgeable participants in the survey description, and we additionally checked for working experience and experience in visual analytics. Both the results on actual visualization use (see Table 3) and on experience in visual analytics (see Table 4) show that participants were knowledgeable and therefore the right target group for our study. In total, we recruited 198 participants, resulting in 2376 evaluable task assessments. Details on the participants per study can be found in Table 4.

Results
In the following, results for each visualization type are presented first. This initial analysis shows whether interaction is a necessary component of a Big Data visualization (Sect. 4.1). For evaluation, MANCOVA is used to analyze our dependent variables individually (effectivity, efficiency, and satisfaction) and ANCOVA to analyze the generated sum score for usability. To check the quality of our results, we also conducted randomization tests to see whether a random allocation of observations to one of the variable's specifications produces a difference in outcome (number of resamples: 200). All randomization tests showed satisfactory results, which are presented in the "Appendix". In the second part of this analysis, the effect of task type is analyzed in more detail (Sect. 4.2).

Table 5 shows descriptive statistics for the variables interaction technique and visualization type, which are the subjects of the first MANCOVA. While interaction seems to have a limited effect on effectivity, it seems to have a positive effect on both efficiency and satisfaction. With respect to the visualization types, the unfavorable results of the polar coordinates plot for the two variables efficiency and satisfaction stand out. Further, the measures of effectivity and satisfaction show excess kurtosis as well as skewness of around − 1, indicating that their distributions are somewhat steeper and skewed to the left compared to a normal distribution. Response time shows a higher deviation from normality (skewness: − 1.7; kurtosis: 4.0), as is typical for time-related experiments with no time constraints imposed on the users. Based on these results, and by taking a closer look at the visualization types used, we find initial support for our hypotheses 1a and 2a, i.e. that both the layout and the dataset influence usability. While the layout seems to have a stronger influence on task time, the dataset has a stronger influence on response accuracy.
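A randomization (permutation) test of the kind described can be sketched as follows. This is a minimal illustration with hypothetical data; beyond the number of resamples, the authors' exact resampling procedure is not specified, so the details here are our assumptions.

```python
import random

def randomization_test(group_a, group_b, n_resamples=200, seed=42):
    """Approximate p-value: the share of random relabellings of the pooled
    observations that yield an absolute mean difference at least as large
    as the one actually observed between the two groups."""
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        resampled = abs(sum(pooled[:n_a]) / n_a
                        - sum(pooled[n_a:]) / (len(pooled) - n_a))
        if resampled >= observed:
            hits += 1
    return hits / n_resamples
```

If random reallocations rarely reproduce the observed group difference, the difference is unlikely to be an artifact of the allocation.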
With respect to hypothesis 1b, we can support the claim that Cartesian-based visualizations show a higher usability, especially when looking at the interactive form. For hypothesis 2b, we find only partial support, as efficiency and satisfaction show a better performance for hierarchy-based visualizations while task accuracy is higher for multi-attribute ones. Further, we find initial support for hypotheses 5a and 5b, stating that interaction indeed has an influence and that usability is higher for interactive than for static visualizations. The next step is to test whether these findings are significant in our multivariate linear model. (We measure experience with visual analytics by asking "Please indicate your experience with visual analytics"; answer options range from "no experience at all" to "extremely experienced".)

MANCOVA for evaluating effectivity, efficiency, and satisfaction
We decided to use MANCOVA for the analysis even though Box's test of equality of covariance matrices is significant (p < 0.001): first, N is high (approx. 500 per visualization type and 1000 per interaction concept, static vs. interactive), so the test might be overly sensitive to violations of equality; and second, the groups are roughly equal in size (between 250 and 270). As a consequence, we report Pillai's Trace and Wilks' Lambda, as these are the more conservative measures minimizing type I error. Table 6 shows that the use of interaction has the strongest effect size in this model, with a partial eta squared (η²) of 0.285. Based on Cohen's classification (1969), this effect can be seen as strong (η² ≥ 0.138), while the effect of the visualization type is 0.091 and therefore considered to be medium in size (η² ≥ 0.059). Usage (previous experience with the particular visualization type) as well as the statistical interaction effect between visualization type and interaction technique (VisType × Interaction) can be interpreted as small effects (η² ≥ 0.010). After this initial analysis of the independent variables and the covariate, a more detailed analysis based on the three dependent variables follows in Table 7. Table 7 shows that none of the independent variables under investigation has a significant effect on effectivity (response accuracy, RA). However, choosing the right visualization type has an effect on efficiency (response time, RT) and satisfaction. Further, using Big Data visualizations in an interactive form also shows an effect on efficiency and satisfaction. Our covariate usage (previous experience with the particular visualization type) influences only satisfaction; no effects can be found on efficiency and effectivity. These results allow us to confirm hypotheses 4 and 5a.
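The effect size measure and the Cohen thresholds quoted above can be made explicit. The sums of squares in the usage example are illustrative numbers, not values from the study.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared: effect sum of squares relative to
    effect plus error sum of squares."""
    return ss_effect / (ss_effect + ss_error)

def cohen_label(eta2):
    """Classify an effect size using the thresholds cited in the text
    (Cohen 1969): >= 0.138 strong, >= 0.059 medium, >= 0.010 small."""
    if eta2 >= 0.138:
        return "strong"
    if eta2 >= 0.059:
        return "medium"
    if eta2 >= 0.010:
        return "small"
    return "negligible"
```

For example, `cohen_label(0.285)` returns `"strong"` and `cohen_label(0.091)` returns `"medium"`, matching the classification of the interaction and visualization-type effects reported above.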
Drilling further down, results of post hoc Sidak tests indicate that for all visualization types the interactive form shows superior results for response time, and that the polar coordinates plot performs significantly worse than all other visualization types for Big Data. Regarding satisfaction, we can identify a superior visualization type (Sankey) as well as an inferior one (polar coordinates). Further, we can again observe that interactive visualization types satisfy participants while static ones seem to frustrate them. (Note: the analysis based on effectivity is not presented, as no significant between-subjects results could be obtained.) The detailed analysis in Table 8 allows us to confirm hypothesis 5b (based on efficiency and satisfaction; no result on effectivity).
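The Sidak correction underlying these post hoc comparisons follows a closed form; a minimal sketch (not the authors' statistics package) looks like this:

```python
def sidak_adjust(p_values):
    """Sidak-adjusted p-values for m simultaneous comparisons:
    p_adj = 1 - (1 - p)^m, capped at 1."""
    m = len(p_values)
    return [min(1.0, 1.0 - (1.0 - p) ** m) for p in p_values]
```

The adjustment controls the family-wise error rate across all pairwise comparisons and is slightly less conservative than the Bonferroni correction.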
Our analysis in Table 9 shows that Cartesian-coordinate-based visualizations outperform Polar-coordinate-based visualizations in terms of efficiency and satisfaction, while the difference in effectivity is not significant (confirming hypothesis 1b). Taking a closer look at the dataset used, we find partial support for our hypothesis 2b: hierarchy-based visualizations perform better in terms of efficiency and satisfaction, while they perform worse (though only at a p < 0.1 significance level) in terms of effectivity.

ANCOVA for evaluating our sum score for usability
After the initial analysis based on MANCOVA, where each dependent variable was considered independently, we now investigate how a single score for usability influences the results and their interpretation. The following table presents all independent variables and the covariate usage (Table 10). On the one hand, interpretation is easier, as only one dependent variable needs to be considered. On the other hand, statistical power and explanatory value are reduced considerably: the previously observed strong effects shrink to medium strength, and r² is only 0.147, indicating that only about 15% of the variability in the sum score for usability can be explained by the model. Nonetheless, the interpretation based on pairwise comparisons stays the same. We can derive that interactive visualizations are superior to static ones for all visualization types tested, and we can conclude that the polar coordinates plot is inferior to the other visualization types used in this study (Tables 11, 12).

Results on task type (per interactive visualization type)
From our analysis in Sect. 4.1, we know that using Big Data visualizations in an interactive form results in better performance with respect to the two dependent variables efficiency and satisfaction, which in turn increases usability. Consequently, the analysis based on task type is carried out only for the interactive visualization types, in order to concentrate on identifying the best visualization for specific task types. Again, we look at the dependent variables separately in a multivariate general linear model (Sect. 4.2.1) and use a univariate general linear model to evaluate the sum score for usability (Sect. 4.2.2). Table 13 shows mean values for the obtained data. What we can see from this analysis is that visualizations showing multiple attributes seem to outperform hierarchy-based visualizations for the task type identify (when looking at effectivity), while for summarize no clear indication of superiority can be found. Regarding the distribution of the variables, we can again see a rather strong deviation from normality. Based on these initial results, we can derive that the task type indeed has an influence on the variables used, with the strongest influence identified for effectivity. Further, it seems that our hypothesis 3b can be supported, while there might be little empirical support for 3c. Nonetheless, for a final résumé, the statistical analysis in the next section is necessary.

MANCOVA for evaluating effectivity, efficiency and satisfaction
Again, Box's test of equality of covariance matrices is significant (p < 0.001), but for this analysis, too, N is high (approx. 250 per visualization type and 350 per task type: identify, compare, and summarize) and the groups are roughly equal in size (between 84 and 92). Consequently, we report Pillai's Trace and Wilks' Lambda as the more conservative measures. Table 14 indicates that the visualization type has an influence on our multivariate general linear model and, based on the effect size represented by η², this effect can be classified as medium. The covariate usage also shows a significant effect, reflecting increased satisfaction along with greater previous experience with the respective visualization type. An effect is also visible for the statistical interaction of task type and visualization type (VisType × TaskType), while no significance is shown for task type alone. A detailed analysis based on the dependent variables is presented in Table 15.
The more detailed analysis in Table 15 reveals that (as already known from the analysis in Sect. 4.1) the visualization type influences efficiency as well as satisfaction, while it has no effect on effectivity. More interestingly, this analysis also shows that task type influences effectivity (supporting H3a) and that there is a significant statistical interaction effect between visualization type and task type (VisType × TaskType). This interaction effect means that the difference in means varies depending on the visualization type used; pairwise comparisons are needed for further insight (Table 16). Response accuracy, without distinguishing between the visualization types, is significantly higher for the task type summarize than for the task type identify. Taking a closer look at the task type identify, it becomes obvious that the visualization types that show disaggregated data (more attributes and only one dimension: parallel coordinates and polar coordinates) are superior, while visualization types that aggregate data based on multiple dimensions (sunburst and Sankey) are inferior. The opposite is true for the task type compare (although the differences are only significant at a p < 0.1 level), and for the task type summarize no significant difference between visualization types can be found. Based on these results, hypothesis 3b can be supported, while 3c cannot.
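The pattern behind such a VisType × TaskType interaction can be inspected by computing cell means of response accuracy per combination of the two factors. The record layout below (`"vis"`, `"task"`, `"ra"` fields) is hypothetical, used only to illustrate the idea:

```python
def cell_means(records):
    """Mean response accuracy ("ra") per (visualization type, task type) cell."""
    totals = {}
    for r in records:
        key = (r["vis"], r["task"])
        s, n = totals.get(key, (0.0, 0))
        totals[key] = (s + r["ra"], n + 1)
    return {key: s / n for key, (s, n) in totals.items()}
```

If the accuracy gap between task types changes sign or size across visualization types in such a table of cell means, that is exactly the crossing pattern a significant interaction term reflects.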
With respect to response time and satisfaction, the same results as in Sect. 4.1 are visible. Overall, polar coordinates are inferior to the other three visualization types, and no significant difference can be found between the sunburst visualization, the Sankey visualization, and the parallel coordinates plot. For satisfaction, the Sankey chart shows superior results while the polar coordinates plot shows the worst outcome.

ANCOVA for evaluating our sum score for usability
After analyzing the multivariate model, we again take a look at the univariate model based on our calculated sum score for usability. The following table shows the obtained results based on the between-subject effects ( Table 17).
As already visible in the first analysis presented in Sect. 4.1.2, the sum score shows a reduced r² (compared to RT and SAT), and the effect sizes of the independent variables are downsized from previously medium effects to small ones. Pairwise comparison reveals that the Sankey visualization and the parallel coordinates plot show the highest scores while the polar coordinates plot shows the worst (significant at p < 0.05). Further, we can again derive that irrespective of the visualization type used, the task type identify is more difficult to perform with Big Data visualizations than the task type summarize (especially when confronted with hierarchy-based visualizations) (Table 18).

Discussion and implications
Reducing complexity by relying on a visual layout and thereby enhancing problem-solving capabilities is the fundamental goal of visualizations (Ohlert and Weißenberger 2015). However, as datasets increase in complexity, conventional business charts (e.g. line, bar, and pie charts) are no longer sufficient, and newer forms of visualization especially designed to deal with this complexity need to be applied (Perkhofer et al. 2019b). Within the discipline of Information Visualization, many researchers are concerned with the creation of such new visualization types, so a large pool of options already exists (Perkhofer 2019). What is missing, however, is testing their ability to inform and satisfy users within their explicit areas of application, such as management accounting.
Thus, in this study we focus on a use-case that is based on data common within the discipline of management accounting and perform an experiment using knowledgeable participants in visual analytics. The dataset used was specifically created for this experiment to show a close-to-real data sample from a management-related perspective (Plaisant et al. 2008). Designing the data sample together with finance experts allowed us to draw from their experience and calculate metrics highly relevant for managerial accounting in a common trading company (in our case the wine trade). The contribution of this paper is the evaluation of multidimensional interactive visualizations for Big Data and its distinct components influencing usability in a large-scale quantitative experiment. By comparing four visualization types (the sunburst visualization, the Sankey visualization, the parallel coordinates plot, and the polar coordinates plot), three different task types (identify, compare, and summarize) as well as different interaction techniques (mostly static vs. interactive), their effect on decision-making (efficiency and effectiveness) and satisfaction could be evaluated.
Recommendations based on these findings should help to increase the usability of these four visualization types in particular, but can also be applied to other types available (Fig. 4).
Summarizing the obtained results, the polar coordinates plot shows below-average performance in all comparisons and should not be used, while the parallel coordinates plot (Cartesian layout) and the Sankey visualization (also Cartesian layout) performed best. Using the right visualization type has a strong effect on efficiency and a medium effect on satisfaction, while no effect on effectivity could be detected. This analysis also indicates that horizontal or Cartesian-based layouts outperform Polar visualizations. Further, our results imply that visualizations representing hierarchies are easier to interpret and work with than visualizations showing multiple attributes; however, they are inferior when identification tasks need to be performed.
With respect to task type, the results demonstrate that the task type influences effectivity and therefore the most important variable of usability. In more detail, the task type summarize shows significantly better performance than the task type identify. Overall, we can state that a visualization results in better performance when it fits the task: identify calls for visualization types that show data in disaggregated form (parallel and polar coordinates), while summarize calls for visualizations that show data in an aggregated form. This is very much in line with previous research asking for a cognitive fit (Ohlert and Weißenberger 2015; Hirsch et al. 2015; Vessey and Galletta 1991).
Also clearly indicated by our results is the need for interaction when working with Big Data visualizations. The use of interaction has a strong effect on satisfaction and, additionally, a medium effect on efficiency. Concerning our covariate usage, we can detect a significant influence on satisfaction, while it has no effect on effectivity or efficiency. Results on all stated hypotheses are summarized in Table 19.
With respect to the evaluation methods used (multiple dependent variables versus one sum score), we observed that a lot of explanatory value is lost by only looking at the sum score of usability instead of analyzing the three dependent variables effectivity, efficiency, and satisfaction independently. On the other hand, results based on the single score can be presented more clearly, and with respect to interpretation, no differences in the recommendations to users are visible. From our perspective, this combined evaluation of the dependent variables and the sum score provides a clear picture of the effects tested in this study.

Limitations and further research
Of course, this study also has some limiting factors that need to be discussed and considered when interpreting the results. The limitations identified also indicate opportunities that can be addressed in further research endeavors.

Limited number of visualizations used: As already explained, we analyzed only a subset of the available visualization options; a huge pool of further possibilities exists. They range from additional forms of one comprehensive visualization to the use of small multiples (multiple small visualizations put in juxtaposition, such as a parallel coordinates plot matrix or a scatterplot matrix). Although the form of one comprehensive visualization is of relevance, it would be of special interest to investigate whether a difference in mental demand and/or usability persists when using more than one visualization to display the dataset. These options need to be further explored.

Different interaction techniques depending on the visualization type:
Results on all visualization types are directly comparable as they were tested with the same questions and the same data dimensions or attributes respectively. However, interaction techniques were used according to common practice, leaving us with different concepts using a different mix of individual techniques. We decided on this approach to ensure high external validity, knowing that we might introduce possible limiting factors for internal validity. We did not want to depart too much from the real world and thereby introduce simplification for the sake of fulfilling all requirements for a valid scientific experiment. However, we do not imply that this is the better approach in general, but it fits our purpose. The most important requirement for this study is that the visualization and the task are generic but also realistic.
Use of MTurk: While the use of MTurk comes with the advantages of a large pool of possible participants and fast survey completion rates, there are also some related limitations. First, during data collection, we as researchers have no control over the process: participants could be disturbed or interrupted, drawing away the attention needed to successfully fulfill the required tasks. Therefore, special attention needs to be paid to the design of the questionnaire, and quality checks need to be implemented to separate good responses from bad ones. Second, most workers live in the United States and India, which might introduce a cultural bias. However, workers tend to be more educated than the general population, so more complex issues can be posted on MTurk, which was important for our study. Further, specific characteristics of the workers (e.g. the need for a bachelor's degree) can be linked to the posted HIT (human intelligence task) in exchange for higher payment. Despite the possible drawbacks, a comparative study in the context of visual analytics that was posted on MTurk and also executed in the lab produced comparable results (Harrison et al. 2014). For this initial stage of our research, we need answers to many manipulations (design, dataset, correlation type…), and we therefore believe MTurk to be an appropriate platform.
No information on the information retrieval process: The way information is retrieved from a visual display gives many indications of design problems. Controlled experiments using eye-tracking have proven to be particularly useful in providing insight into the data retrieval process (Falschlunger et al. 2016a) and might be able to shed further light on specific design issues. We could not make use of a controlled environment when using the crowdsourcing platform MTurk. Thus, we cannot assume that all participants participated under the same conditions, meaning the same internet speed, the same display accuracy of the end device, and the same environmental conditions (e.g. a silent surrounding). Further, having enough cognitive resources is highly important for uncovering insight. Measuring cognitive load directly, for example by relying on physiological measurement methods such as eye-tracking (Perkhofer and Lehner 2019) or heart rate variability (Hjortskov et al. 2004), might yield more reliable results than relying on self-reported data.
Effectivity is measured dichotomously: The use of questions that are either correctly or incorrectly answered could be the reason why only low effects on effectivity are visible within the model. Asking different questions that allow effectivity to be assessed on a finer scale, rather than as 0 or 1, could be beneficial to gain further insight into this measure.

Concluding remarks
In conclusion, Big Data visualizations allow a large amount of data to be shown in one comprehensive visualization; however, a special focus needs to be placed on their design (including an appropriate layout and interaction techniques) as well as on the task they are supposed to support. The new visualizations presented (sunburst, Sankey, parallel coordinates plot, and polar coordinates plot) are well suited to showing transaction-based data. Management accounting is therefore an interesting area of application; however, a user's lack of experience leaves them at a disadvantage in terms of interpretation and satisfaction (Perkhofer et al. 2019b). Without knowledge (or stored schemas in long-term memory) of how to interpret these visual forms and how to operate them, insight cannot be triggered (Sweller 2010). This necessitates a detailed focus on user-centered visual and functional design, a fact that has been largely neglected so far (Isenberg et al. 2013). This study is a first attempt at closing this gap.

Randomization check 3: task type
Randomization checks are only provided for significant pairwise comparison results in Sect. 4 (Table 33):

• RA1: identify vs. summarize