An enormous amount of data is produced every day. With proper data visualisation, the information hidden in the data can be revealed quickly and easily. This requires a communication channel that transfers information from the data to the user quickly and efficiently. By using visual elements like charts, graphs, and maps, data visualisation is an accessible way to see and understand trends, outliers, and patterns in data. This chapter offers an overview of relevant data visualisations divided into thematic categories and supported by examples.
Keywords: Visualisation, Data, Chart, Information
In today's world, we encounter enormous amounts of data every day. To convert data into useful information, it must be presented to the user in a way that allows interpreting, analysing and applying the information gained (Yau 2011). This requires a communication channel that transfers information from the data to the user quickly and efficiently; data visualisation provides such a channel. Tableau Software, a company offering a software platform for interactive data presentation, gives a brief yet comprehensive definition: “Data visualisation refers to the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualisation is an accessible way to see and understand trends, outliers, and patterns in data” (Tableau Software 2018). According to Friedman (2008, p. 1), the “main goal of data visualisation is to communicate information clearly and effectively through graphical means. To convey ideas effectively, both aesthetic form and functionality need to go hand in hand, providing insights into a rather sparse and complex data set by communicating its key-aspects more intuitively”.
Visualisation is an important step in the whole process of data analysis. The legendary statistician John Tukey often discussed visualisation as a way of finding meaning in data (Tukey 1977). Despite his statistical focus, he believed that the graphic presentation of information plays an immense role. A proper visualisation based on source data can help to understand the data, improve decision making and provide a more objective view of the problem represented by the data (Yau 2013). A graphic can also reveal hidden patterns and relationships.
Visualisation methods have gone far beyond traditional data presentation with simple charts and graphs. Modern trends approach data visualisation as both a science and an art. Certain standards of correctness (e.g. choosing a method according to the characteristics of the data) are still kept, but there is also an effort to make the result interesting and catchy enough to attract the reader’s attention. Sophisticated data visualisation and infographics methods offer a variety of exciting charts and diagrams. An advantage of today's technologies is the possibility of presenting outputs as online interactive web tools, which makes the information that the author attempts to communicate even more intuitive and attractive to process.
In this chapter, non-spatial data and its visualisation are discussed. Non-spatial data plays an undeniable role in the field of economics and business intelligence. For that reason, an overview of the most common and powerful ways to visualise it is presented on the following pages.
Nowadays, a variety of software is easily available, and knowledge of some of it is part of general digital literacy. Almost everyone who uses a computer is able to create some visualisation with the available software. Most computer users are familiar with Microsoft Excel, software that needs no introduction (or with its open-source alternatives LibreOffice and OpenOffice). Working in these tools is relatively convenient and straightforward, as the underlying data and graphical tools are integrated into one user environment, and the whole process is very intuitive. However, this approach does not always produce proper or high-quality graphical outputs, and it supports the user’s tendency to blindly insert data into the provided graphics templates without deeper thinking. With this approach, the data loses its ability to tell the story that is stored in it (Nussbaumer Knaflic 2015). Another point is the technical maturity of the output. In the world of modern technologies, where most information is distributed online, it is much more professional to produce outputs that offer a degree of interactivity and support simple distribution in the digital environment. Interactivity allows the viewer to engage with the data in ways that are impossible with static graphs. With an interactive plot, viewers can zoom into the areas they care about, highlight the data points that are relevant to them and hide the information that is not (Barter 2017). For this reason, some tools that make it possible to create interesting graphical outputs are introduced below.
8.1.1 Tableau Software
The company Tableau Software offers a set of tools of the same name designed for exploratory analysis and data visualisation. The product focuses especially on effective and highly aesthetic visualisation, which undoubtedly attracts many customers. The full version of the program is paid, but the Tableau Public version is freely available. In this version, a user can work with many formats such as Microsoft Excel, Access, text files, JSON files, databases and also spatial data (several data formats are supported). After loading the data, the user can easily select the attributes to visualise, and a set of visualisation options is offered automatically based on the data type. The main idea of Tableau Public is interactivity and the presentation of outputs in the online environment, so the result can be shared with other users as an attractive interactive data visualisation. The tools are multiplatform: they are available in desktop, mobile and online versions.
R is free and open-source statistical and mathematical computing software, primarily focused on data analysis and modelling. Since R has been developed mainly for statistical analysis, it has a solid background for the different types of calculations needed in data analysis. There are many packages that extend the functionality of R with just a simple code command. Thanks to these packages, R is a very powerful tool for data visualisation. Of course, knowledge of code writing is required (as with HTML), which makes R unusable for many people. But once this obstacle is overcome, a new world of data handling and visualisation opens up. All graphics can be saved in vector formats, so it is possible to edit and refine the design of the outputs in suitable graphical software, like Adobe Illustrator or Inkscape. Besides traditional static graphics, interactive outputs can also be produced with special R packages. Sometimes interactivity costs no more than one extra line of code.
Datawrapper is an online tool for making interactive charts. It has a very simple interface; a user can upload data from a file or paste the values directly into a field. The tool generates graphics automatically, and the user can choose one of 16 types of visualisation. Several refining steps are available, such as customising the axes, labelling or setting colours. This tool is an ideal solution when one needs a quick, simple interactive visualisation without any programming.
[Table: An example of visualisation tools]
8.2 Charts Classification
There might be some confusion in terminology regarding the visualisation of non-spatial data. The words ‘chart’ and ‘graph’ are both commonly used to describe any visual output. For many people these two terms mean the same, but there is a difference. A chart is the superior term for a group of methods of presenting information. A graph is a particular graphical tool that shows a mathematical relationship between sets of data (Blaettler 2018). With this approach, a graph is a subcategory of a chart. For this reason, the term chart is preferred in this chapter, to keep the description of the different methods broader.
Different types of charts are described in the following sections. Since there are dozens of possible visualisations, only the most interesting or most commonly used variants are introduced. For clarity, the individual methods are divided into thematic groups. The inspiration for this system was the book Visualize This (Yau 2011) and the website www.datavizproject.com (Ferdio ApS 2017).
8.2.1 Trend Over Time
Time series are typical data for many phenomena. Things change over time, and this change can be easily captured and presented with suitable graphics. When working with time series, users try to explore the trend in the data. Is the value of the phenomenon increasing or decreasing? Are there any repetitive cycles?
Temporal data can be divided into discrete and continuous types. Knowing which type the data belongs to should guide the user in deciding which kind of graph to use. For example, a monthly revenue report is information referenced to a single time step, a month, so it can be considered a discrete phenomenon. A simple bar or point graph can then be used. The second type is continuous data. This is information that can be measured at any time of day on any day of the year. A typical example is temperature or another meteorological phenomenon; among economic data, stock exchange prices can serve as an example. The structure of the data is the same for discrete and continuous phenomena, so the difference has to be expressed by the proper way of visualisation. The simplest solution for continuous data is to connect the discretely plotted values with a line.
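The discrete case can be illustrated with a minimal sketch. It assumes hypothetical monthly revenue figures (one value per month) and renders them as a plain-text bar chart; the function name `text_bar_chart` and the numbers are illustrative only and do not come from the chapter.

```python
def text_bar_chart(values, width=30):
    """Return one text line per category; bar length is proportional to value."""
    largest = max(values.values())
    if largest <= 0:
        raise ValueError("at least one value must be positive")
    lines = []
    for label, value in values.items():
        # Scale each bar relative to the largest value so it fits in `width`.
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<4}{bar} {value}")
    return lines


# Hypothetical monthly revenue (discrete data: one value per month).
revenue = {"Jan": 120, "Feb": 90, "Mar": 150}
for line in text_bar_chart(revenue):
    print(line)
```

Each bar stands for one closed time step, which is why separate bars (or points) suit discrete data better than a connecting line.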
Bar Chart
Point Chart
A point chart works on the same principle as a bar chart, except that the geometrical element used is a simple point. This can sometimes be more suitable, since points carry less graphic load than bars. A point chart is also known as a scatterplot when non-temporal data is used. It is crucial to create a proper axis representing the value of the phenomenon, as there is no other way to read the value.
Line Chart
Step Chart
Gantt Chart
Proportion data is grouped by categories or types. Each category represents a part of a certain whole. The distribution of these proportions is the most important information for comparing the groups with each other. With proportional visualisation, questions like “Are all of the categories equally represented? Is there any category which dominates?” can be answered.
For this type of visualisation, the data needs to take the form of proportions that add up to 1 or 100%. Every part can be stored both relatively (as a proportion) and absolutely; the absolute values make it possible to compare not only the proportional parts but also the total size or amount in different categories.
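As a minimal sketch of this data preparation, the following Python function converts absolute totals into proportions that sum to 1 while keeping the absolute values available. The category names, the numbers and the function name `to_proportions` are hypothetical.

```python
def to_proportions(totals):
    """Convert absolute totals per category into shares that sum to 1."""
    whole = sum(totals.values())
    if whole <= 0:
        raise ValueError("the total must be positive")
    return {category: value / whole for category, value in totals.items()}


# Hypothetical absolute values per category.
sales = {"A": 50, "B": 30, "C": 20}
shares = to_proportions(sales)
for category, share in shares.items():
    # Report both the absolute amount and the relative share.
    print(f"{category}: {sales[category]} units, {share:.0%} of the whole")
```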
Pie Chart
A pie chart is one of the most frequently used charts and is typical for explaining proportions. The circle, which represents the whole, is divided into sectors. The arc length of each sector (or its central angle, or its area) illustrates the proportion of the individual category. All categories together must form the whole (100%).
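The geometry described above can be sketched in a few lines of Python: each category receives a central angle proportional to its share of the whole, and the sectors together cover the full 360 degrees. The function name `pie_sectors` and the example values are assumptions made for illustration.

```python
def pie_sectors(values):
    """Return (category, start_angle, end_angle) in degrees for each sector."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("the total must be positive")
    sectors = []
    start = 0.0
    for category, value in values.items():
        # The central angle is the category's share of the full circle.
        sweep = 360.0 * value / total
        sectors.append((category, start, start + sweep))
        start += sweep
    return sectors


# Hypothetical category totals.
for category, start, end in pie_sectors({"A": 50, "B": 30, "C": 20}):
    print(f"{category}: from {start:.1f} to {end:.1f} degrees")
```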
Doughnut Chart (Fig. 8.7)
According to some sources (Nussbaumer Knaflic 2015), the pie or doughnut chart is an inappropriate way to visualise proportional data. The reason is that angles and areas are harder to perceive than lengths (which carry the key information in, e.g., bar charts); this is a common property of human visual perception. When two or more categories have approximately the same value, it is difficult to decide which one is greater. This issue can be solved by adding labels. Still, several authors recommend using other proportional methods, such as stacked or simple bar charts.
Stacked Bar Chart
Instead of pie or doughnut charts, a simple bar chart ordered from the highest value to the lowest can be used. All bars share the same baseline, so their endpoints are easier to compare and even small differences can be distinguished. The lengths of the bars are rescaled so that their sum equals the whole (100%).
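A minimal sketch of this alternative follows, assuming hypothetical category totals in which two categories are nearly equal. The values are rescaled to percentages of the whole and sorted from highest to lowest; the function name `ordered_shares` is illustrative.

```python
def ordered_shares(totals):
    """Return (category, percent) pairs sorted from highest to lowest share."""
    whole = sum(totals.values())
    if whole <= 0:
        raise ValueError("the total must be positive")
    percents = [(category, 100.0 * value / whole)
                for category, value in totals.items()]
    return sorted(percents, key=lambda item: item[1], reverse=True)


# Hypothetical totals; "A" and "B" differ only slightly.
for category, percent in ordered_shares({"A": 24, "B": 26, "C": 50}):
    # All bars start at the same baseline, so endpoints are directly comparable.
    print(f"{category:<2}{'#' * round(percent)} {percent:.0f}%")
```

With a common baseline, the two-point difference between the smaller categories shows up as a visibly longer bar, which would be hard to judge from two similar pie sectors.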
Tree Map/Area Chart
Tree maps and area charts share the issue with the perception of two-dimensional objects that was discussed in the pie chart paragraph. However, if the area chart has a regular cell-based structure, the information can be read correctly by simply counting the cells. Nussbaumer Knaflic (2015, p. 59) describes another situation in which area charts are quite helpful: “when visualisation of numbers of vastly different magnitude is needed. The second dimension you get using a square for this (which has both height and width, compared to a bar that has only height or width) allows this to be done in a more compact way than possible with a single dimension”.
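For a cell-based area chart, each category has to be assigned a whole number of cells. The sketch below shows one possible way to do this on a grid of 100 cells; the largest-remainder rounding used here is an assumption chosen for illustration (the chapter does not prescribe a rounding rule), and the function name `allocate_cells` is hypothetical.

```python
def allocate_cells(totals, cells=100):
    """Split `cells` grid cells among categories in proportion to their totals."""
    whole = sum(totals.values())
    if whole <= 0:
        raise ValueError("the total must be positive")
    exact = {c: value / whole * cells for c, value in totals.items()}
    # Start from the rounded-down share of each category.
    counts = {c: int(share) for c, share in exact.items()}
    leftover = cells - sum(counts.values())
    # Give the remaining cells to the largest fractional remainders.
    by_remainder = sorted(exact, key=lambda c: exact[c] - counts[c], reverse=True)
    for category in by_remainder[:leftover]:
        counts[category] += 1
    return counts


# Hypothetical totals that do not divide evenly into 100 cells.
print(allocate_cells({"A": 1, "B": 1, "C": 1}))
```

Because the counts always add up to the grid size, a reader can recover each category's percentage by counting its cells.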
8.2.3 Relations and Correlation
There are many ways to quantify relations between several variables or groups. A statistical approach provides mathematical tools, such as correlation or regression (if the conditions regarding the characteristics of the variables are fulfilled). Sometimes it is much easier just to plot the data to reveal the hidden relations. A correlation simply describes how two variables change together. It is sometimes forgotten that correlation does not equal causation. A chart of the basic correlation of two variables can quickly describe the behaviour of the data: the strength of the relation can be estimated, and a clustering tendency may be discovered.
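As a numerical companion to plotting, the strength of a linear relation between two variables can be estimated with the Pearson correlation coefficient. The chapter does not name a specific coefficient, so the choice of Pearson's r here is an assumption; the sketch uses only the Python standard library and hypothetical data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two variables around their means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical paired observations.
advertising = [1, 2, 3, 4, 5]
revenue = [2, 4, 5, 4, 6]
print(f"r = {pearson_r(advertising, revenue):.2f}")
```

A value near 1 or -1 indicates a strong linear relation and a value near 0 a weak one; as noted above, neither says anything about causation.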
Bubble Plot
Scatterplot Matrix
8.2.4 Differences and Comparison
Comparing a single variable is not a demanding task: the value of every record is displayed by one of the previously mentioned methods and analysed. Bar charts or simple point charts serve this task well. For two or three variables, several suitable charts have been introduced in the previous sections. For data with more variables, known as multidimensional data, different graphical methods have to be used.
Parallel Coordinates
8.2.5 Statistical Charts
Distribution Plot
8.3 A Good Design
At the beginning, the data analyst has to know the data in detail. Once the analyst understands what kind of information is hidden in the data and what the data type and character are, they can decide which type of chart is the best solution for a proper visualisation. The next step is the design of the chart. The raw default output from the software is not wrong, but it is usually not the most attractive result either. With additional improvement of the graphics, the information that the author tries to deliver with the chart may become easier to perceive.
It must always be considered who the audience of the chart is and for which purpose the chart is created. Through design, the author can influence the way the chart is read. If the author wants to focus on a significant trend, the axes and labels can be de-emphasised with grey colour while the primary trend line is highlighted. The trend is then the information that draws attention first. On the other hand, a chart designed for reading exact values must have readable and accurate labels on all axes.
Some general recommendations can be made. Mostly modest colours should be used. Some colour schemes can evoke emotions or feelings (e.g. red colours indicate activity that should be addressed; neutral pastel colours mean that all features in the chart are equal). Proper labelling should be provided: though the user might know the context in which the chart is placed, they do not know the meaning of every single element of the chart. Therefore, a chart title, axis names, value labels and a legend explaining the colours should be part of the visualisation. Geometrical aspects are also important. Sometimes it is more suitable just to rotate the chart, which makes it much easier to read (e.g. for a bar chart with long category names, rotation to horizontal is more natural because it follows the way we read ordinary text). A different spatial arrangement of geometrical features can solve issues with blank space or fit better into the overall graphic design (text or poster). Transforming the geometrical elements into pseudo-3D should be avoided (the only exception is plotting three-dimensional data with a 3D plot). Unfortunately, the 3D pie chart, for example, is quite popular. As discussed above, the pie chart is not always a good choice for visualising proportional data; the combination with 3D makes it even more difficult to read or to compare with other pie charts, because the third dimension is problematic for perception. Perspective in 3D visualisation can also be misused for promotion: a pie chart segment placed in the foreground looks larger than a segment of similar size in the background.
- Barter, R. (2017). Interactive visualization in R. http://www.rebeccabarter.com/blog/2017-04-20-interactive/. Accessed 13 Nov 2018.
- Blaettler, K. (2018). The difference between charts & graphs. Sciencing. https://sciencing.com/difference-between-charts-graphs-7385398.html. Accessed 2 Jan 2019.
- Ferdio ApS. (2017). Data viz project. https://datavizproject.com/
- Friedman, V. (2008). Data visualization and infographics. Smashing Magazine.
- Tableau Software. (2018). What is data visualization? A definition, examples, and resources. https://www.tableau.com/learn/articles/data-visualization. Accessed 30 Oct 2018.
- Tukey, J. W. (1977). Exploratory data analysis. Reading: Addison-Wesley Publishing.
- Yau, N. (2011). Visualize this: The FlowingData guide to design, visualization, and statistics. Indianapolis: Wiley.
- Yau, N. (2013). Data points: Visualization that means something. Indianapolis: Wiley.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.