Evaluating Free Rides and Observational Advantages in Set Visualizations

Blake, Andrew; Stapleton, Gem; Rodgers, Peter; Touloumis, Anestis

doi:10.1007/s10849-021-09331-0

Open access
Published: 15 April 2021

Volume 30, pages 557–600, (2021)
Journal of Logic, Language and Information

Andrew Blake¹,
Gem Stapleton ORCID: orcid.org/0000-0002-6567-6752²,
Peter Rodgers³ &
…
Anestis Touloumis¹

Abstract

Free rides and observational advantages occur in visualizations when they reveal facts that must be inferred from an alternative representation. Understanding whether these concepts correspond to cognitive advantages is important: do they facilitate information extraction, saving the ‘deductive cost’ of making inferences? This paper presents the first evaluations of free rides and observational advantages in visualizations of sets compared to text. We found that, for Euler and linear diagrams, free rides and observational advantages yielded significant improvements in task performance. For Venn diagrams, whilst their observational advantages yielded significant performance benefits over text, this was not universally true for free rides. The consequences are two-fold: more research is needed to establish when free rides are beneficial, and the results suggest that observational advantages better explain the cognitive advantages of diagrams over text. A take-away message is that visualizations with observational advantages are likely to be cognitively advantageous over competing representations.

1 Introduction

This paper sets out to empirically test the belief that free rides (Shimojima 2015) and observational advantages (Stapleton et al. 2017) are features of visual modes of communication that aid cognition. Free rides occur when a given representation of information is translated into another and the resulting representation makes explicit some facts that must be derived (inferred) from the original. Such explicit facts are precisely the free rides. The generalisation of free rides to observational advantages removes the requirement for a translation. Instead, it allows the comparison of competing representations that are semantically equivalent: if one representation makes a fact explicit that must be inferred from another representation then that fact is an observational advantage of the former over the latter. In this paper, we study the potential cognitive benefits of free rides and observational advantages in the context of representations in textual form and diagrammatic form. The specific research questions we address are:

(RQ1) Does using text alongside (semantically equivalent) diagrams that are derived from the text lead to significant performance benefits over just using text when identifying information that is conveyed by free rides?
(RQ2) Do diagrams lead to significant performance benefits over using text when identifying information that is conveyed by observational advantages?

RQ1 is designed to suggest whether free rides, where one representation of information needs to be translated into another to reveal facts that would otherwise have to be inferred, lead to cognitive advantages. RQ2 addresses the newer idea of an observational advantage, where there is no requirement for such a translation and, thus, no expectation that a user will be viewing multiple representations of information in different notations. Answering these questions is important for the design of visual modes of communication: if free rides and observational advantages yield demonstrable performance benefits then we should favour visualization methods that exhibit them as compared to competing representations. This paper specifically focuses on information about sets, visualized by linear diagrams, Venn diagrams, and Euler diagrams as seen in Fig. 1.

We now give a simple example. Suppose we have information about people who have visited various countries:

Everyone who visited Denmark visited Germany
No one visited both Germany and Mali
Everyone who visited Oman visited Denmark
Everyone who visited Uganda visited Denmark
No one visited Uganda and Oman.

From these statements, which correspond to subset (Everyone...) and disjointness (No one...) relations between sets, various facts can be inferred which are not explicitly stated. These inferences include Everyone who visited Uganda visited Germany and No one visited both Mali and Uganda. By translating the originally given five textually-expressed facts into the Euler diagram in Fig. 1, we make these two inferred facts explicit and, so, they are examples of free rides from the diagram. This is because, in the first case, the translation necessarily places the Uganda circle inside the Germany circle. In the second case, the Mali and Uganda circles necessarily do not overlap. Indeed, this Euler diagram also makes additional derivable facts explicit, such as No one visited both Mali and Oman, and therefore has many free rides. All of these free rides are also observational advantages, due to this being a more general concept.
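The inferences in this example can be reproduced mechanically. The following is a minimal sketch, assuming only two inference rules (transitivity of subset, and inheritance of disjointness by subsets) and assuming that no set is forced to be empty. The function name `derive` and the tuple encoding are illustrative choices, not part of the study materials.

```python
def derive(subsets, disjoints):
    """Close stated facts under transitivity and inherited disjointness."""
    sub = set(subsets)
    changed = True
    while changed:
        # Transitivity: A subset of B and B subset of C give A subset of C.
        new = {(a, d) for a, b in sub for c, d in sub if b == c and a != d} - sub
        sub |= new
        changed = bool(new)

    def below(x):
        # x itself plus every set known to be a subset of x.
        return {x} | {a for a, b in sub if b == x}

    # If B and D are disjoint, every subset of B is disjoint from every subset of D.
    dis = {frozenset((p, q)) for b, d in disjoints
           for p in below(b) for q in below(d)}
    return sub, dis


# The five statements from the example, encoded as (subset, superset) pairs
# and as unordered disjoint pairs.
stated_sub = {("Denmark", "Germany"), ("Oman", "Denmark"), ("Uganda", "Denmark")}
stated_dis = [("Germany", "Mali"), ("Uganda", "Oman")]

all_sub, all_dis = derive(stated_sub, stated_dis)
inferred_sub = all_sub - stated_sub
inferred_dis = all_dis - {frozenset(p) for p in stated_dis}

for a, b in sorted(inferred_sub):
    print(f"Everyone who visited {a} visited {b}")
for a, b in sorted(tuple(sorted(p)) for p in inferred_dis):
    print(f"No one visited both {a} and {b}")
```

Under these assumptions, the derived-but-unstated facts include the inferences named above, such as Everyone who visited Uganda visited Germany and No one visited both Mali and Uganda.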

Whilst diagrams can make some information explicit through free rides, there is a lack of empirical evidence that using text and diagrams in combination improves task performance (RQ1). For instance, providing two representations of information may increase the time taken to perform tasks without bringing any accuracy improvements. The first study on which we report presents participants with either just text, or text in combination with a diagram, allowing us to suggest whether the addition of a diagram, obtained by translating the text, more effectively supports the identification of information which must be inferred from the text but is a free ride in the diagram. Our hypothesis, in line with the widely held view, is that free rides do bring significant performance benefits. The second study in this paper addresses RQ2, by presenting participants with either just text or just diagrams. Answering RQ2 will suggest whether observational advantages yield significant performance benefits.

Focusing on the visualizations of sets is of particular importance because there are enormous amounts of set-based data available in a wide variety of application areas (Alsallakh et al. 2014). Reflecting this abundance of data, the research community is actively devising numerous methods for visualizing it. Set visualization techniques often exploit closed curves (or variations thereof) (Collins et al. 2009; Chow and Ruskey 2005; Meulemans et al. 2013; Riche and Dwyer 2010; Simonetto et al. 2009) or lines for representing sets (Alper et al. 2011; Cheng 2011; Gottfried 2015; Rodgers et al. 2015). This paper therefore focuses on such methods by evaluating Venn diagrams and Euler diagrams, both of which use closed curves (Stapleton et al. 2011; Wilkinson 2012; Venn 1880), as well as linear diagrams, which use line segments (Rodgers et al. 2015). The linear diagram is on the left of Fig. 1, with the Venn and Euler diagrams in the middle and on the right respectively.

In linear diagrams, if one set-line occupies only x-coordinates that another set-line occupies then this asserts a subset relationship. For instance, in Fig. 1 the set-line for Uganda occurs only where Germany also occurs: everyone who visited Uganda visited Germany. Non-overlapping lines correspond to set disjointness so, here, one can see that Mali and Uganda are disjoint. By contrast, the Venn diagram in the middle uses shading to assert set emptiness: the region inside both (the curve for) Mali and (the curve for) Uganda is entirely shaded, so the corresponding sets are disjoint. For subset relations, such as (informally) ‘Uganda is a subset of Germany’, we see that the non-shaded region inside Uganda is entirely within Germany. Whilst Euler and Venn diagrams both exploit closed curves to represent sets, they have very different means of expressing information about those sets. In contrast with the use of an additional syntactic element, namely shading, in Venn diagrams, Euler diagrams use spatial relationships between curves. Our empirical studies will, therefore, test the role of free rides and observational advantages in explaining cognitive efficacy in three diagrammatic notations that convey information about sets in fundamentally different ways. This allows us to provide general insights, not results that are specific to one type of diagrammatic notation.

The paper is structured as follows. Section 2 covers free rides and observational advantages in the context of inference and the (potential) cognitive benefits of diagrams. We further illustrate free rides and observational advantages, in relation to observable information and meaning-carrying relationships, in Sect. 3. Section 4 covers the methodology adopted for our study, including details of how the textual and diagrammatic stimuli were generated. It covers the tasks and training given to the participants, describes our data collection method and overviews the statistical methods employed. Section 5 presents our first study, evaluating the role of free rides, addressing RQ1. Section 6 covers our second study, evaluating the role of observational advantages, addressing RQ2. We conclude in Sect. 7. The supplementary material includes all of the experimental stimuli, the collected data, and the statistical models and output used for the analysis, and is available from various URLs that will be provided at appropriate places in the paper.

2 Free Rides, Observational Advantages and Inference

Free rides were introduced by Shimojima, with the idea dating back to his 1996 thesis (Shimojima 1996) so it is far from new. In his much more recent book, published in 2015, Shimojima states the following (Shimojima 2015):

Potential for Free Ride in Inference Expressing a set of information in diagrams can result in the expression of other, consequential information. This enables us to skip certain mental deductive steps and to substitute them with the task of comprehending the consequences from the diagrams.

He goes on to say, when discussing the extraction of derived meanings from text:

... we usually go through a rather lengthy process of (i) interpreting each of the ... sentences in the text, (ii) integrating the individual pieces of information thus obtained, and (iii) drawing a conclusion appropriate to the question.

and contrasts this with the case of diagrams^{Footnote 1}:

[information] can be derived ... given the various constraints holding on [the position of syntactic elements in] diagrams and the situations they represent

An entire chapter of his book is devoted to the potential for free rides in inference problems. In that chapter, he discusses the occurrence of free rides in both Venn and Euler diagrams, which we explain in the next section. At the heart of all the discussion is how a representation in textual form can be translated into a diagram to reveal ‘hidden’ information and the potential this has for facilitating inference. This insight is reflected in the first study in this paper, where participants will be presented with textual statements alongside a semantically equivalent diagram, obtained by translating the original text.

Shimojima claims that free rides are “advantageous for the purpose of making efficient inferences” (Shimojima 2015), where we take efficient to mean faster. He goes on to talk about the cost of inferences:

A free ride saves one the cost of a deductive inference to [a] valid consequence, but not the cost of recognizing and interpreting the source type that is automatically realized in one’s diagram.

Here, cost could be taken to mean either time savings or accuracy improvements. No evidence has yet been provided that free rides which occur in visualizations of sets do indeed yield significant performance improvements. This paper sets out to address this assumption by empirically evaluating free rides in a variety of diagrammatic representations of sets, as compared to natural language. In this empirical study, natural language forms the ‘original’ notation which is then translated into diagrammatic form.

Building on Shimojima’s novel idea of a free ride, more recent work has generalised this notion to that of an observational advantage (Stapleton et al. 2017). In this generalised case, there is no requirement for a translation from one notation into another that then reveals facts that would otherwise need to be inferred. Instead, the concept of an observational advantage allows us to compare the ‘advantages’ of one representation of information over another, semantically equivalent representation: if one representation explicitly represents a fact that must be inferred from the other then that fact is an observational advantage. This paper sets out to address the assumption that observational advantages lead to cognitive advantages by empirically evaluating their occurrence in a variety of diagrammatic representations of sets, as compared to natural language.

3 Free Rides and Observational Advantages in Linear, Venn and Euler Diagrams

We are focusing on visualizing information about sets, using three types of diagrams. Here, we informally illustrate free rides and observational advantages evident in these diagram types by comparing them to set-theoretic sentences such as $X\subseteq Y$ and $X\cap Y=\emptyset $ (although in our empirical studies, as seen in the introduction, constrained natural language expressions of the form Everyone who visited X visited Y and No one visited both X and Y are used for the purposes of accessibility).

3.1 Meaning Carrying Relationships

To understand free rides and observational advantages more precisely, we need to consider the idea of a meaning-carrying relationship. Taking a set-theoretic sentence such as $A\subseteq B$, there is a unique meaning-carrier: the symbol A is written to the left of $\subseteq $ and B to the right. It is from this meaning-carrying relationship that the sentence $A\subseteq B$ conveys the information that A is a subset of B. More precisely, a meaning-carrying relationship is a relation on the syntactic items in a statement that carries semantics and evaluates to either true or false (Stapleton et al. 2017). In our example, either it is true that A is a subset of B or it is not. In general, set-theoretic sentences have unique meaning-carriers. By extension, our constrained natural language expressions are also taken to have one meaning-carrier and the sentence either makes a true statement or a false one.

Our three diagram types can also express $A\subseteq B$, seen in Fig. 2. In the linear diagram (left), the set-line for A only occupies x-coordinates shared with the set-line for B: this meaning-carrying relationship expresses $A\subseteq B$. In the Venn diagram (middle), the non-shaded region inside A is entirely within B and in the Euler diagram the curve A is entirely inside the curve B: again, these meaning-carrying relationships express $A\subseteq B$.

Linear, Venn and Euler diagrams, unlike symbolic set-theory, typically have multiple meaning-carrying relationships. For instance, in Fig. 1, the relationship between any pair of curves in the Euler diagram is a meaning carrier. Here, the inclusion of the Oman circle inside the Germany circle is a meaning-carrier, as is the non-overlapping nature of Denmark and Mali. The meaning-carrying relationships evident in linear and Venn diagrams are, in fact, in a direct correspondence to those we see in Euler diagrams. For example, in Fig. 1, we can see that the line in the linear diagram for Oman shares all its x-coordinates with the line for Germany, asserting that Oman is a subset of Germany. In the Venn diagram, the equivalent meaning-carrier is a little more subtle, perhaps: the non-shaded region inside Oman is also inside Germany. For disjointness information, the linear diagram ensures that the lines for Denmark and Mali do not overlap whereas, in the Venn diagram, the region inside both curves is entirely shaded.
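The three kinds of meaning-carrier described above can be checked on simple abstract encodings of the diagrams. The sketch below is illustrative only: it assumes a linear diagram is given as the set of overlap columns each set-line occupies, a Venn diagram as a list of zones plus the shaded ones, and an Euler diagram as circles `(x, y, radius)`. All names and coordinates are hypothetical and are not taken from Figs. 1 or 2.

```python
from math import hypot

def linear_subset(lines, a, b):
    # a's set-line occupies only columns that b's set-line also occupies.
    return lines[a] <= lines[b]

def linear_disjoint(lines, a, b):
    # The two set-lines share no column.
    return not (lines[a] & lines[b])

def venn_subset(zones, shaded, a, b):
    # Every non-shaded zone inside a is also inside b.
    return all(b in z for z in zones if a in z and z not in shaded)

def venn_disjoint(zones, shaded, a, b):
    # Every zone inside both a and b is shaded.
    return all(z in shaded for z in zones if a in z and b in z)

def euler_subset(circles, a, b):
    # Circle a lies entirely inside circle b.
    (xa, ya, ra), (xb, yb, rb) = circles[a], circles[b]
    return hypot(xa - xb, ya - yb) + ra <= rb

def euler_disjoint(circles, a, b):
    # The two circles do not overlap.
    (xa, ya, ra), (xb, yb, rb) = circles[a], circles[b]
    return hypot(xa - xb, ya - yb) >= ra + rb


# Three hypothetical encodings that each express "A is a subset of B".
lines = {"A": {1}, "B": {0, 1}}
zones = [frozenset("A"), frozenset("B"), frozenset("AB")]
shaded = {frozenset("A")}                      # the "A only" zone is empty
circles = {"A": (0.0, 0.0, 1.0), "B": (0.0, 0.0, 3.0)}

print(linear_subset(lines, "A", "B"),
      venn_subset(zones, shaded, "A", "B"),
      euler_subset(circles, "A", "B"))
```

Each predicate reads off one pairwise relationship, which mirrors the observation that the meaning-carriers in the three notations correspond directly to one another despite their different syntax.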

3.2 Observation

Now that we have intuitively introduced the idea of a meaning-carrier, we can consider what it means to be able to observe information from a representation: given a representation of information, R, a statement that is directly obtained from a meaning-carrying relationship is observable from R (Stapleton et al. 2017). In the simple example seen in Fig. 2, we can observe $A\subseteq B$ from each of the three diagrams using their previously stated meaning-carriers. In Fig. 1, again using the diagrammatic meaning-carriers just given, we can observe ${ Oman }\subseteq Germany $ and $ Denmark \cap { Mali }=\emptyset $.

3.3 Free Rides

Putting all this together, we can more precisely state what is meant by a free ride, although we refer the reader to Stapleton et al. (2017) and Shimojima (2015) for more complete descriptions: a free ride from one representation of information, say $R_1$, that is derived by translating a semantically equivalent representation, $R_2$, is a statement that is observable from $R_1$ but not from $R_2$^{Footnote 2}. Thus, the two statements just given, ${ Oman }\subseteq Germany $ and $ Denmark \cap { Mali }=\emptyset $, are examples of free rides from each of the diagrams^{Footnote 3} in Fig. 1 when they are presented alongside an alternative set-theoretic representation:

1. $ Denmark \subseteq Germany $
2. $ Germany \cap { Mali }=\emptyset $
3. ${ Oman }\subseteq Denmark $
4. $ Uganda \subseteq Denmark $
5. $ Uganda \cap { Oman }=\emptyset $.

These five statements are the set-theoretic versions of the five textual statements given in the introduction, from which the three diagrams were derived. Thus, by extension, we have the fact that the textual statements

1. Everyone who visited Oman visited Germany, and
2. No one visited both Denmark and Mali

are free rides.

3.4 Observational Advantages

Using meaning carriers and observation, we are also able to give a more precise definition of an observational advantage: an observational advantage from one representation of information, say $R_1$, given a semantically equivalent representation, $R_2$, is a statement that is observable from $R_1$ but not from $R_2$. Thus, the two statements just given, ${ Oman }\subseteq Germany $ and $ Denmark \cap { Mali }=\emptyset $, are examples of observational advantages from each of the diagrams in Fig. 1; there is no assumption that these diagrams were derived by translating the set-theoretic representation or, therefore, that they are presented alongside that representation.

3.5 Summary

The meaning-carrying relationships in linear, Venn and Euler diagrams are essentially equivalent, even though these diagram types use very different syntactic conventions to represent information. From meaning-carriers, we can identify free rides, and observational advantages, in the context of alternative representations of information. On this basis, we can readily compare equivalent linear, Venn and Euler diagrams to constrained natural language statements about sets, to determine whether free rides and observational advantages bring about the cognitive benefits alluded to by Shimojima’s prior work.

4 Methods

To address our research questions, two empirical studies were conducted that measured task performance in terms of accuracy and time. This section describes the approach adopted to collect performance data, including the information being presented in textual form and visualized by diagrams, the tasks participants were asked to perform, the method used for data collection and the statistical methods employed for its analysis.

The first study presented participants with either textual statements or a diagram alongside the textual statements. For the second study, the diagrams were presented in isolation (so not in combination with text) and compared to the textual statements. In the studies, participants were asked to perform 20 tasks, the details of which are provided in what follows; these 20 tasks were presented in the performance phase of the study which was preceded by a training phase. Each task was a multiple choice question with five options, exactly one of which was the correct answer. Two options related to subset-style statements and two were disjointness-style statements. The fifth option was always ‘none of the above’. For associated study materials, see https://www.cs.kent.ac.uk/people/staff/pjr/freerides/paper.html and https://www.cs.kent.ac.uk/people/staff/pjr/observationaladvantages/paper.html.

4.1 Generating Set Relationships and Corresponding Textual Stimuli

For the studies, we needed to generate textual statements that would be used as task stimuli and that would be translated into diagrams. It was essential that the textual statements yielded diagrams that exhibited free rides (and, therefore, observational advantages). Moreover, to avoid ceiling and floor effects, the information contained in the statements should require cognitive effort to understand without being overly complex. This meant that a reasonable number of sets needed to be used in the statements. For instance, using just three sets would lead to very few diagrams that exhibited free rides and observational advantages and the information would be simple to interpret. Informal experimentation suggested that using five sets led to controlled variability in the diagrams and that sufficient free rides could be generated. The first pilot study that we conducted supported our belief that using five sets would lead to cognitive effort being required by the participants, but without causing undue hindrance to performance. That is, there was no obvious ceiling or floor effect; descriptive statistics obtained from the pilot study data, in the first study, are given in Sect. 5.

To limit the complexity of the statements, we included information about subset and disjointness relationships between pairs of sets only. So, the textual statements were of the form:

1. Subset: Everyone who visited A visited B.
2. Disjointness: No one visited both A and B.

Each task was based on five such statements that were randomly generated in the order in which they were to be presented to participants. In each case, the statement type was randomly generated (i.e. subset or disjoint), then the sets A and B were randomly selected from the five sets involved in the task, at this point simply called sets 1 to 5; their names were determined later. This gave us information about five sets in the form of subset and disjointness relationships. An example of five randomly generated statements can be seen in Fig. 3. In this case, all statements concern subset relations.

Each collection of five statements was required to conform to the following characteristics:

1. the two sets A and B involved in a statement were never the same set; this ruled out trivially true assertions in the subset case and empty sets in the disjointness case,
2. if $A\subseteq B$ was asserted then $B\subseteq A$ was not asserted; this prevented sets being equal,
3. if $A\subseteq B$ or $B\subseteq A$ was asserted then $A\cap B=\emptyset $ was not asserted; this also prevented sets being empty,
4. for $1<i\le 5$, the $i^{th}$ statement could never be inferred from the preceding statements; this meant each statement contained new information that was not already given by the statements generated (and written down in the question) before it,
5. the five statements had to give rise to at least one free ride and, therefore, observational advantage.

The first four requirements were to avoid the potential for confusion amongst participants. The last requirement was essential for the purposes of the study.
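One way to realise these five characteristics is rejection sampling, sketched below. This is not the authors' generator: the tuple encoding `("sub", a, b)` / `("dis", a, b)`, the function names, and the two-rule closure used to decide what "can be inferred" are all assumptions made for illustration, and the closure ignores the corner case in which a set is forced to be empty.

```python
import random

def derive(stmts):
    """Close the statements under transitivity and inherited disjointness."""
    sub = {(a, b) for kind, a, b in stmts if kind == "sub"}
    changed = True
    while changed:
        new = {(a, d) for a, b in sub for c, d in sub if b == c and a != d} - sub
        sub |= new
        changed = bool(new)

    def below(x):
        return {x} | {a for a, b in sub if b == x}

    dis = {frozenset((p, q)) for kind, a, b in stmts if kind == "dis"
           for p in below(a) for q in below(b)}
    return sub, dis

def inferable(stmt, stmts):
    sub, dis = derive(stmts)
    kind, a, b = stmt
    return (a, b) in sub if kind == "sub" else frozenset((a, b)) in dis

def acceptable(stmt, stmts):
    kind, a, b = stmt
    if a == b:
        return False                                   # characteristic 1
    stated_sub = {(x, y) for k, x, y in stmts if k == "sub"}
    stated_dis = {frozenset((x, y)) for k, x, y in stmts if k == "dis"}
    if kind == "sub" and ((b, a) in stated_sub or frozenset((a, b)) in stated_dis):
        return False                                   # characteristics 2 and 3
    if kind == "dis" and ((a, b) in stated_sub or (b, a) in stated_sub):
        return False                                   # characteristic 3
    return not inferable(stmt, stmts)                  # characteristic 4

def free_rides(stmts):
    sub, dis = derive(stmts)
    stated_sub = {(x, y) for k, x, y in stmts if k == "sub"}
    stated_dis = {frozenset((x, y)) for k, x, y in stmts if k == "dis"}
    return sub - stated_sub, {d for d in dis - stated_dis if len(d) == 2}

def generate(rng, n_sets=5, n_stmts=5):
    while True:
        stmts = []
        while len(stmts) < n_stmts:
            # Random statement type, then two random sets from 1..n_sets.
            stmt = (rng.choice(["sub", "dis"]),
                    rng.randint(1, n_sets), rng.randint(1, n_sets))
            if acceptable(stmt, stmts):
                stmts.append(stmt)
        if any(free_rides(stmts)):                     # characteristic 5
            return stmts

statements = generate(random.Random(0))
print(statements)
```

Whether the study rejected individual statements or whole collections is not stated, so the two nested loops here are a design choice of this sketch rather than a description of the original procedure.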

There were also requirements that had to be met by the 20 collections of five statements. Firstly, no two sets of five statements were isomorphic (i.e. the informational content was never the same up to the chosen set names). Including isomorphic sets of statements could impact task performance due to increased participant familiarity with the essential structure of the information conveyed. Further requirements arose because we also needed two categories of task: 10 tasks were about identifying that subset statements were true (i.e. participants would need to identify that Everyone who visited A visited B was necessarily true) and the remaining 10 tasks were about disjointness statements (i.e. participants would need to identify that No one visited both A and B was necessarily true). Therefore, we needed 10 sets of statements that had a subset free ride and 10 sets of statements that had a disjointness free ride.
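The isomorphism requirement can be checked by brute force, since five sets admit only 120 relabellings. The sketch below is an assumption-laden illustration: it compares the stated facts as unordered collections (treating disjointness as symmetric), whereas "informational content" could also be read as the full set of derivable facts. The encoding and function names are hypothetical.

```python
from itertools import permutations

def canonical(stmts):
    # Order of statements is ignored; disjointness pairs are unordered.
    return frozenset((k, a, b) if k == "sub" else (k, *sorted((a, b)))
                     for k, a, b in stmts)

def isomorphic(s1, s2, n_sets=5):
    """True if some renaming of sets 1..n_sets turns s1 into s2."""
    target = canonical(s2)
    labels = range(1, n_sets + 1)
    for perm in permutations(labels):
        rename = dict(zip(labels, perm))
        if canonical((k, rename[a], rename[b]) for k, a, b in s1) == target:
            return True
    return False


s1 = [("sub", 1, 2), ("dis", 2, 3)]
s2 = [("dis", 5, 4), ("sub", 3, 4)]   # s1 with sets renamed 1->3, 2->4, 3->5
s3 = [("sub", 2, 1), ("dis", 2, 3)]   # different structure: the smaller set is disjoint from 3
print(isomorphic(s1, s2), isomorphic(s1, s3))
```

A candidate collection would then be kept only if it is not isomorphic to any collection already accepted.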

Once a set of 20 collections of five statements was generated that had the requisite properties, names were assigned to the sets. Country names were used, with no two names starting with the same letter. From a set of 26 such names, five were randomly selected for each task. They were allocated to sets 1 to 5, used in the statement generation, so that they first appeared in the five statements in alphabetical order. Figure 4 shows the result of assigning five randomly selected set names to the statements in Fig. 3.
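The naming step can be sketched as follows. The pool of names here is a short hypothetical stand-in (the study's 26 names are not listed in this section), and the handling of a set that appears in no statement is an assumption.

```python
import random

# Hypothetical pool: distinct initial letters, as the study required.
POOL = ["Belgium", "Denmark", "France", "Germany", "Kenya", "Mali", "Oman", "Uganda"]

def assign_names(stmts, pool, rng, n_sets=5):
    """Map set numbers to names so names first appear in alphabetical order."""
    chosen = sorted(rng.sample(pool, n_sets))
    order = []
    for _, a, b in stmts:
        for s in (a, b):
            if s not in order:
                order.append(s)            # order of first appearance
    # Assumption: sets that appear in no statement receive the leftover names.
    order += [s for s in range(1, n_sets + 1) if s not in order]
    return dict(zip(order, chosen))

def render(stmt, names):
    kind, a, b = stmt
    if kind == "sub":
        return f"Everyone who visited {names[a]} visited {names[b]}"
    return f"No one visited both {names[a]} and {names[b]}"


stmts = [("sub", 3, 1), ("dis", 1, 5), ("sub", 2, 3)]
names = assign_names(stmts, POOL, random.Random(1))
for stmt in stmts:
    print(render(stmt, names))
```

Because the chosen names are sorted before being zipped with the order of first appearance, reading the rendered statements from top to bottom meets the country names in alphabetical order.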

4.2 Creating Diagrams

The 20 sets of statements generated for the studies were used to create linear, Venn and Euler diagrams. Where possible, the layout features were kept consistent across notations but, due to their syntactic properties, some differences are inherent. Linear diagrams were drawn with straight line segments, Venn diagrams with ellipses, and Euler diagrams with circles. All three diagrammatic notations employed the following common layout features:

1. Colour: each set was assigned a unique colour, with a set of five colours generated by ColorBrewer (Harrower and Brewer 2003). The colours were chosen to ensure they were visually distinguishable and suitable for categorical data. These colours were then used in the linear, Venn and Euler diagrams to colour the lines, ellipses, and circles respectively. It is known that the use of colour in this way improves the effectiveness of Euler diagrams (and therefore Venn diagrams) (Blake et al. 2016). For linear diagrams, using colour in this way does not significantly reduce (or improve) performance as compared to using monochrome (Rodgers et al. 2015). The five colours were assigned in a fixed order and then, for each set of statements, allocated to the sets in alphabetical order.
2. Font: each set name was written in Times Roman font, size 12, matching the textual statements. The name was assigned the same colour as its associated set and, thus, line, ellipse, or circle.
3. Line thickness: each line, ellipse and circle was 3.85 pixels wide.

An example is given in Fig. 5. It can be seen that the colours assigned to each set are the same across notations, that the fonts match and take the same colours as their associated line or curve, and that the line thicknesses are the same.

4.2.1 Linear Diagram Layout Features

The linear diagrams were drawn using the layout guidelines in Rodgers et al. (2015). Each set-line was drawn horizontally with few line breaks. Vertical grid lines were used to mark the start and end of overlaps; an overlap corresponds to a particular set intersection, such as the rightmost overlap in the linear diagram of Fig. 1 which represents the set of people who visited Denmark, Germany and Uganda but not Mali or Oman. The grid lines can also be seen in Fig. 5. The sets were ordered alphabetically from top to bottom, which meant that the colours always appeared in the same top-to-bottom order, as is evident by comparing Figs. 1 and 5.

4.2.2 Venn Diagram Layout Features

Each Venn diagram comprised five ellipses and had a symmetric layout. A fixed shade of grey was used to indicate the emptiness of sets. The set names were assigned to the ellipses alphabetically in a clockwise direction starting from the top of the diagram. This meant, given that colours are assigned to set names in alphabetic order, that each Venn diagram used in the 20 performance phase questions differed from the others only by the regions which were shaded and the names of the sets, as is evident by comparing Figs. 1 and 5.

4.2.3 Euler Diagram Layout Features

Each Euler diagram was drawn using circles of a range of sizes. Circles are known to be a cognitively effective shape (Blake et al. 2016). Set names (labels) were positioned so that they were near the outside of their associated circle. Labels did not obfuscate each other. Where possible, the labels did not overlap with a circle. The regions formed by the circles did not have overly small areas.

4.3 Tasks and Training

As stated above, each question was multiple choice with five options, exactly one of which was correct. One option was ‘None of the above’. The other four options always included two ‘Everyone ...’ statements and two ‘No one ...’ statements.

Ten of the 20 tasks required the identification of subset-style statements. The remaining ten tasks corresponded to disjointness-style statements. The 20 sets of five statements were thus divided into two sets of ten. In some cases, a set of five statements exhibited only subset free rides and observational advantages (‘Everyone ...’ statements) and so was assigned to the ‘subset’ task type. Similarly, in other cases only disjointness free rides and observational advantages (‘No one ...’ statements) were present and so the five statements were assigned to the ‘disjointness’ task type. The remaining sets of five statements were randomly divided between the two categories whilst ensuring ten tasks in each.

Given the allocation of task types to sets of statements, we had to choose a statement to be the correct answer. In each case, one of the statements in the appropriate category was randomly selected. The sets involved in the other three options were randomly chosen whilst ensuring that the information in the associated option could not be inferred from the original five statements (i.e. the incorrect options were not necessarily true).

Having identified the correct answer and three incorrect options for each question (as well as a further incorrect option, namely ‘None of the above’), we paid particular attention to the order in which the options were presented. To control the variability between subset and disjointness tasks, each task type had 2 correct answers as option one, 3 correct answers as option two, 3 correct answers as option three, and 2 correct answers as option four. The remaining four (incorrect) options were randomly ordered around the correct answer except that ‘None of the above’ always appeared last. A screenshot from the first study, addressing RQ1, can be seen in Fig. 6 where the correct answer is option 1.
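The balancing of answer positions can be sketched as below. The 2/3/3/2 split and the fixed final position of ‘None of the above’ come from the description above; how particular tasks were matched to positions is not stated, so the shuffle is an assumption, and the function names are illustrative.

```python
import random

NONE = "None of the above"

def position_plan(rng):
    """Zero-based slot of the correct answer for the ten tasks of one type."""
    plan = [0] * 2 + [1] * 3 + [2] * 3 + [3] * 2
    rng.shuffle(plan)          # assumption: tasks are matched to slots at random
    return plan

def lay_out(correct, distractors, slot, rng):
    """Place the correct answer at `slot`, distractors around it, NONE last."""
    options = list(distractors)
    rng.shuffle(options)
    options.insert(slot, correct)
    return options + [NONE]


rng = random.Random(0)
plan = position_plan(rng)
options = lay_out("correct statement", ["wrong 1", "wrong 2", "wrong 3"], plan[0], rng)
print(plan, options)
```

Running `position_plan` once per task type gives both the subset and the disjointness tasks the same distribution of correct-answer positions.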

In order for participants to be able to perform the tasks, initial training was provided. This comprised a series of four tasks. The first task used just three sets and the correct answer was a disjointness-style statement. The second training task used four sets and was in the subset category. The final two training tasks used five sets, making them similar to the tasks used in the study, one for subset and one for disjointness. The screenshot in Fig. 7 shows how a training question was presented in the first study, using text and an Euler diagram representation, in the case of three sets. Figure 8 shows the corresponding explanation given after the participant submitted their answer. Training was similar for the other groups in the first study, differing due to the nature of the syntax in the representation. For the second study, the training removed reference to the textual statements when participants were exposed to one of the diagrammatic treatments, as in Figs. 9 and 10. The 20 performance phase questions were presented similarly to Fig. 6. The training material and all of the performance material can be found in the supplementary files.

4.4 Data Collection Method

Both studies adopted a between-group study design with four groups. The first study, for RQ1, included: text-only, linear diagrams with text, Venn diagrams with text, and Euler diagrams with text. The second study, for RQ2, which ran at a later time, included: text-only, linear diagrams, Venn diagrams and Euler diagrams. Pre-screening was used for each study, as described in the relevant sections below.

Participants were randomly assigned to one group. Prolific Academic was used to crowdsource participants from the general population. It is recognised that in some crowdsourced studies, participants do not always give questions their full attention, or have difficulties with the language used, and this is hard to control (Chen et al. 2011). We call such participants inattentive. A common technique to identify inattentive participants is to include questions that are trivial to answer. An example, from the linear diagrams with text group, is in Fig. 11. It can be seen that the participant was told it was an attention checking question, the first four options used country names that did not appear in the textual statements or diagram, and the last option instructed them to choose that one. The presentation for the questions designed to identify inattentive participants was different in the second study, with details given later. Our studies each included two attention checking questions, for each group, and they always appeared as the 7th and 14th questions in the performance phase. A participant was classified as inattentive if they answered either of these two questions incorrectly; their data were not analyzed.
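The exclusion rule for inattentive participants can be expressed as a short filter. This is a minimal Python sketch with a hypothetical data layout (each participant's responses as a mapping from question position in the performance phase to whether the answer was correct); treating a missing answer as incorrect is an assumption of the sketch, not something stated in the text.

```python
# Positions of the two attention checking questions in the performance phase.
ATTENTION_CHECK_POSITIONS = (7, 14)


def is_inattentive(responses):
    """Return True if either attention checking question was answered incorrectly.

    `responses` maps a 1-based question position to True (correct) or
    False (incorrect). A missing answer counts as incorrect (assumption).
    """
    return any(not responses.get(pos, False) for pos in ATTENTION_CHECK_POSITIONS)


def attentive_only(participants):
    """Keep only participants whose data would be analysed."""
    return {pid: r for pid, r in participants.items() if not is_inattentive(r)}
```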

In both the training phase and the performance phase, each question was displayed on a separate page. Participants could not return to pages and subsequent pages were not revealed until the previous answer was submitted. Unlike the training questions, which were presented in the same order for all participants, in the performance phase the questions were randomly ordered. Participants were instructed to maintain their concentration on the study and to answer questions without delay, unless a question explicitly said otherwise (these were the attention checking questions). The full information provided to participants is given in the supplementary material.

4.5 Statistical Methods

Recall that we are collecting accuracy and time data as indicators of performance. Accuracy is viewed as more important than time: one representation of information is judged to be more effective than another if users can perform tasks significantly more accurately with it or, if no significant accuracy difference exists, performance is significantly quicker. For each study, we employed two generalized estimating equations models (Liang and Zeger 1986) to analyse the accuracy data. An ANOVA calculation was not appropriate as the data violated the normality assumption. The non-parametric version of ANOVA, Kruskal-Wallis, was also not appropriate, as the responses for each individual are correlated, and so not independent.
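The rule for judging one representation more effective than another, accuracy first and time only as a tie-breaker, can be written as a small decision function. The sketch below is illustrative: the function and parameter names are hypothetical, the p-values are assumed to come from the fitted models, and the default threshold uses the paper's significance level of $p\le 0.05$.

```python
ALPHA = 0.05  # significance threshold used throughout the paper


def more_effective(acc_a, acc_b, p_accuracy, time_a, time_b, p_time, alpha=ALPHA):
    """Return "A", "B" or None (no significant difference) for two treatments.

    Accuracy takes priority: time is consulted only when the accuracy
    difference is not significant.
    """
    if p_accuracy <= alpha and acc_a != acc_b:
        # Significantly more accurate treatment wins, regardless of time.
        return "A" if acc_a > acc_b else "B"
    if p_time <= alpha and time_a != time_b:
        # No significant accuracy difference: the quicker treatment wins.
        return "A" if time_a < time_b else "B"
    return None
```

Under this rule a treatment that is significantly more accurate is preferred even if it is also significantly slower.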

For the time data, for each study we used two generalized estimating equations models (Liang and Zeger 1986) that allowed us to estimate whether the time taken to provide answers was significantly different. Again, alternative statistical tests, such as ANOVA, were not appropriate as their assumptions were violated by the data.

Full details of the models and the statistical output can be found in the Supplementary Material. Whilst we view accuracy as the most important indicator of performance differences, all analysis that was performed is reported in the paper. Throughout, results are declared significant if $p\le 0.05$.

5 Free Rides Study: Diagrams as a Support for Text

This section addresses RQ1, which is broken down into the following more specific questions:

RQ1a: Do the free rides exhibited by the considered diagram types, when presented in combination with text, bring about significant task performance benefits over text alone?

RQ1b: Do the free rides exhibited by any one diagram type lead to significant task performance benefits over the other diagram types?

RQ1c: Do tasks concerning subsets lead to significantly different task performance compared to those concerning disjointness?

In reporting on our data collection and results, we will refer to the treatments as

T for ‘text only’,
L&T for linear diagrams and text,
V&T for Venn diagrams and text, and
E&T for Euler diagrams and text.

The online version of this study, where it is possible to select the group in which to take part (unlike the actual study where participants were randomly assigned to a group) can be found here: https://www.cs.kent.ac.uk/people/staff/pjr/freerides/. An example of the stimuli for one study task can be seen in Figs. 12, 13, 14 and 15, with the associated options in Fig. 16. The correct answer is option 3, so this is a disjointness-style task.

When running a pilot study^{Footnote 4}, we pre-screened participants using the following criteria:

1. they had to have a Prolific approval rate of $97\%$ or higher, and
2. they had to have completed at least 5 studies on Prolific previously.
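The two pre-screening criteria amount to a simple conjunction. The following Python sketch shows the check; the `Candidate` record and its field names are hypothetical and do not correspond to any Prolific data format or API.

```python
from dataclasses import dataclass

MIN_APPROVAL_RATE = 97.0  # percent, criterion 1
MIN_COMPLETED_STUDIES = 5  # criterion 2


@dataclass
class Candidate:
    """Hypothetical record for a potential participant."""
    approval_rate: float  # approval rate in percent
    completed_studies: int  # number of studies completed previously


def passes_prescreening(candidate):
    """Return True only if both pre-screening criteria are met."""
    return (
        candidate.approval_rate >= MIN_APPROVAL_RATE
        and candidate.completed_studies >= MIN_COMPLETED_STUDIES
    )
```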

This left a pool of 25,313 potential participants, out of 58,462, so over half were disqualified. Each participant was allowed a maximum of 45 minutes to complete the study, with an expected completion time of 20 minutes, and was randomly allocated to one of the four groups. They were each paid £2.61, reflecting the time we expected it to take to complete the study.

A total of 41 people began the pilot study. Of these, five were classified as inattentive, two timed out after 45 minutes and a further four (all in the Venn group) withdrew before completion. This left data from 30 participants. Prolific indicated that participants took, on average, 23 minutes to complete the study, which includes all time spent on training, performing the tasks and supplying demographic information. For the pilot, the overall accuracy rate and the mean time, in seconds, to answer each question are given in Table 1, alongside a breakdown for each group; the low number of participants in the Venn group is likely due to a combination of the random allocation and the four withdrawals. The overall data do not indicate a ceiling or floor effect, and the range of accuracy rates and mean times suggests that there may be differences across the groups.
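As a check on the figures reported for the pilot, the arithmetic can be reproduced directly. The short Python sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Pre-screening pool, as reported in the text.
pool_total = 58_462
pool_eligible = 25_313
share_excluded = 1 - pool_eligible / pool_total  # fraction disqualified

# Pilot participant flow, as reported in the text.
started = 41
exclusions = {"inattentive": 5, "timed_out": 2, "withdrew": 4}
analysed = started - sum(exclusions.values())

print(f"Disqualified by pre-screening: {share_excluded:.1%}")
print(f"Participants with analysed data: {analysed}")
```

The share disqualified is a little under 57%, consistent with the statement that over half of the pool was excluded, and the participant count matches the 30 reported.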

Table 1 Summary of the pilot data
