This chapter first addresses data visualization and then discusses the relationship between data visualization and aesthetics. It discusses the definitions of data and information and the forms and characteristics of traditional data visualization, and emphasizes understanding the meaning of data in terms of effectiveness and efficiency. The chapter then outlines the key forms of data visualization, which include trees, scatter plots, charts, tables, maps, diagrams, graphs, waveforms, simulations and volumes.

2.1 The Concept of Data Visualization

The term “data visualization” has a long history dating back to the 2nd century AD. In ancient societies, drawings and other visual representations were used to investigate the world and to record historical events. Data visualization has contributed significantly to invention and discovery throughout human history (Crapo, Waisel, Wallace, & Willemain, 2000). The invention of computer technology has fundamentally changed the way data is visually represented, and data analysis has become quicker and more accurate through computer-generated graphical visualization. Data visualization has become an important part of research in many fields, including algorithms, human perception, animation and computer vision. In contemporary society, data visualization is usually associated with the field of computer science. As an emerging field, it is considered a sub-classification of visualization and is regarded as “the science of visual representation of ‘data’” (Friendly, 2009, p. 2). The technology of data visualization has evolved from hand drawing in the earliest stages, to “photo-etching”, to computer technology such as computer graphics and software (Friendly, 2009). In particular, the development of computer software has advanced the application of data visualization, allowing users to manipulate substantial amounts of data for exploration and analysis in an easier and more affordable way. In this book, data visualization that focuses on communication for understanding data, rather than on other approaches, is considered traditional data visualization.

Traditional data visualization has a number of advantages. To begin with, it has the ability to represent a vast amount of data immediately. Secondly, it enables viewers to identify emergent properties (e.g. patterns) in the data immediately and to formulate new insights. The third advantage is that it can be used for product quality control, where the immediate identification of problems is made possible through data analysis. The fourth is that it enhances the understanding of both large-scale and small-scale data. In this regard, Gray, Mayer, and Hughes (cited in Ware, 2012, p. 3) suggest that data visualization assists in the construction of hypotheses.

Data visualization often results in graphical images of data or concepts, which assist in making decisions (Ware, 2012). The development of computing technology facilitates data visualization, enabling users to identify useful information or derive insights from graphical images. The importance of data visualization is described as follows:

The success of data visualization is due to the soundness of the basic idea behind it: the use of computer-generated images to gain insight and knowledge from data and its inherent patterns and relationships. A second premise is the utilization of the broad bandwidth processes, and simulations involving data sets from diverse scientific disciplines and large collections of abstract data from many sources. (Post, Nielson, & Bonneau, 2003, back cover)

This statement emphasizes that data visualization as scientific research relies on computing technology and its utilization for the processing of information. For this research, it is necessary to first define the concept of data visualization in order to explore and identify the key forms and characteristics for designing a theoretical framework for Taoist data visualization, which will be discussed in Chap. 7.

2.1.1 The Definition of Data Visualization

The term “data visualization” can be defined in several ways. Most definitions focus on the connection between data and computer technology in order to transform data into a visual or sonic form. Card, Mackinlay, and Shneiderman (1999) define data visualization as “the use of computer-supported, interactive, visual representations of data to amplify cognition” (p. 6). Manovich (2010) defines it as “a transformation of quantified data which is not visual into a visual representation” (p. 20). According to Friendly (2009), data visualization refers to “information which has been abstracted in some schematic form, including attributes or variables for the units of information” (p. 2). Data visualization involves an information exchange that includes the messenger, the receiver, and the message (Kirk, 2012). Kirk (2012) defines data visualization as “the representation and presentation of data that exploits our visual perception abilities in order to amplify cognition” (p. 17). This emphasizes that the design of data visualization requires representing data in an effective and efficient form. Visual representation of data is a key element of these definitions. The purpose of data visualization is to identify patterns within the graphic through exploring and analyzing data. Thus, Bikakis (2018) defines data visualization as the following:

Data visualization is the presentation of data in a pictorial or graphical format, and a data visualization tool is the software that generates this presentation. Data visualization provides users with intuitive means to interactively explore and analyze data, enabling them to effectively identify interesting patterns, infer correlations and causalities, and supports sense-making activities.

From a mathematical perspective, data visualization draws on an “understanding of the real number line, time, measurement, and estimation” as well as an “understanding of ratio concepts, notably fractions, proportions, percentages, and probabilities” (Reyna, Nelson, Han, & Dieckmann, 2009).

Numerous visualization software applications have been developed whose main task is exploring, visualizing and analyzing data. In modern times, data visualization involves four aspects: real-time interaction, on-the-fly processing, visual scalability, and user assistance and personalization (Bikakis, 2018). As Bikakis (2018) explains, real-time interaction requires efficient and scalable techniques that “should support the interaction with billion objects datasets, while maintaining the system response in the range of a few milliseconds” (p. 2). For on-the-fly processing, “the support of on-the-fly visualizations over large and dynamic sets of volatile raw (i.e., not preprocessed) data is required” (Bikakis, 2018, p. 2). For visual scalability, an effective data abstraction mechanism is necessary when addressing information-overload problems. Finally, a key feature of modern visualization is user assistance and personalization, that is, the capability to adapt to various user-defined exploration scenarios and preferences (Bikakis, 2018).

Data visualization can be categorized into two major sub-fields: information visualization and scientific visualization (Marai, 2010; Post et al., 2003). Information visualization is used to visually represent abstract data, such as business data (Card, Mackinlay, & Shneiderman, 1999; Spence, 2007), while scientific visualization represents scientific data, which is usually physically based (e.g. the human body, the environment or the atmosphere) (Spence, 2007). Both information and scientific visualization focus on how to transform data into a visual form so that it becomes understandable information for gaining insight and knowledge. Figure 2.1 presents the fundamental process of data visualization, in which data in any form can be transformed into graphical images. When a user reads or looks at a graphical image, the image is interpreted through the human cognitive system for the acquisition of insight or the apprehension of useful information.

Fig. 2.1 The process of data visualization, illustrated by Yaqin Fu

Central to the process of data visualization is the transformation of data into information. Understanding the differences between the concepts of data and information, as well as their relationship is crucial to understanding the process of data visualization. The difference between data and information is outlined in the next section.
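To make this transformation concrete, the following is a minimal sketch, in Python with the matplotlib library and a hypothetical dataset, of the pipeline shown in Fig. 2.1: raw numbers are mapped to visual marks, and the rendered image is what a viewer then interprets.

```python
# A minimal sketch of the Fig. 2.1 pipeline: raw data -> visual mapping -> image.
# Assumes Python with matplotlib installed; the dataset is hypothetical.
import matplotlib.pyplot as plt

# Raw, unprocessed data: numbers with no meaning on their own.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
values = [12, 15, 9, 22, 30, 27]

# Visual mapping: each (month, value) pair becomes a bar whose height encodes the value.
fig, ax = plt.subplots(figsize=(5, 3))
ax.bar(months, values, color="steelblue")
ax.set_xlabel("Month")
ax.set_ylabel("Measured value")
ax.set_title("From raw numbers to a graphical image")

# Rendering: the graphical image that a viewer interprets through the cognitive system.
fig.savefig("figure.png", dpi=150)
```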

2.1.2 The Definition of Data and Information

The definition of data may vary depending on the context of the discipline and the application of the data. Data usually refers to raw, unprocessed information that is not recognized as having any meaning. Data can be organized in a variety of forms, such as names, numbers, ages, signs, characters or symbols (Bellinger, Castro, & Mills, 2004; Yeung, 1998; Zins, 2007), and as files, reports or graphs in a business context (“Concepts in information and processing,” n.d.). According to Ackoff (1989), data are symbols, while other researchers define data as “computerized representations of models and attributes of real or simulated entities” (Chen, 2005, p. 1). Data is an integral part of culture in the contemporary information society. Humans are confronted with a flood of data every second, from sources such as the World Wide Web, cellular calls and networks of smart home applications (Pedrycz, 2005). Data is raw and simply exists. Data “has no significance beyond its existence (in and of itself). It can exist in any form, usable or not. It does not have meaning of itself” (Bellinger et al., 2004).

There are two types of data: primary data and secondary data (Agarwal, 2006). If the data are “collected from the units or individual respondents directly for the purpose of certain study of information” (Agarwal, 2006, p. 3), they are called primary data. For example, primary data can be the data collected from a census study or from an enquiry made of taxpayers. Many examples can be found in research, particularly in scientific studies. Primary data can concern numbers and people; for example, when “an experiment is conducted to know the effect of certain fertilizer doses on the yield or the effect of a drug on the patients, the observations taken on each plot or patient” are the primary data (Agarwal, 2006, p. 3). Agarwal (2006) defines secondary data as data obtained from existing records, processed and statistically analyzed to extract some information for another purpose (p. 3). This emphasizes that secondary data has been collected and statistically processed by certain people or agencies. Therefore, secondary data can be obtained from year books, census reports, research reports or survey reports for scientific studies. In this sense, primary data can be considered raw data and secondary data can be considered information.

In this book, “data” is defined as it is used in the fields of computing science and information science, as “unprocessed information” in the form of numbers or binary codes, which are the quantities stored and transmitted as electrical signals in computers. In this sense, data can be measured, explored, manipulated or retrieved. The data discussed in this book includes both primary data and secondary data.

Data and information have a very close relationship, as data is usually defined as “unprocessed information” and information is considered “processed data” (Hey, 2004; Zins, 2007). Data itself has no value, significance or meaning until it has been processed into a form of information that can be understood by humans (Bellinger et al., 2004; Bernstein, 2011; Zins, 2007). Many different forms have been used to represent data, depending on whether the data is geometric or symbolic, whether it is static or dynamic, and on the purposes the data visualization will serve.

As with data, the definition of “information” depends on the discipline and the context. In information science, the term normally refers to “processed data” that has meaning and can be understood by humans (Ackoff, 1989; Davis & Olson, 1985; “How to define data, information and knowledge,” n.d.). Information is considered “collections of data” (Zins, 2007, The Panel’s Definitions section, para. 32) that have been “collected and processed into numbers, artificial and natural languages, graphic objects that convey significance and meaning” (Zins, 2007, The Panel’s Definitions section, para. 32). Information can be data that “are processed to be useful, providing answers to ‘who’, ‘what’, ‘where’, and ‘when’ questions” (Chen, 2005, p. 1). Bellinger et al. (2004) argue that

information is data that has been given meaning by way of relational connection. This “meaning” can be useful, but does not have to be. In computer parlance, a relational database makes for information from the data stored within it.

Therefore, information is considered useful data that people can recognize and understand. Chen (2005) describes information as “data that represents the results of a computational process, such as statistical analysis, for assigning meanings to the data, or the transcripts of some meanings assigned by human beings” (p. 1).

2.1.3 The Forms of Traditional Data Visualization

To understand traditional data visualization, this section discusses its main forms. Different forms of data visualization produce different visual effects, which can help to identify problems or issues, and it is important to select appropriate types of data visualization for different disciplines or research fields. Although information visualization and scientific visualization are two different research fields, their visual forms share certain similarities and often overlap. Table 2.1 presents the main forms of data visualization. These forms of representation are also called “visualization techniques” (Cawthon & Moere, 2007; Fayyad & Grinstein, 2002), which have been developed to optimize the selection and creation of the most effective and efficient transformation of data into information (Senay & Ignatius, 1990).

Table 2.1 The forms of traditional data visualization. It shows the types and common forms of data visualization

As represented in the table, the common forms of data visualization include trees, scatter plots, charts, tables, diagrams, and graphs. Selecting forms for data visualization depends on the type of data. Bertin (1977) suggests that data visualization primarily deals with two fundamental types of data: data values and data structures. Ware (2012) makes a similar suggestion that data can be split into entities and relations, in which entities are the objects being visualized, and relations are the structures and patterns that connect entities. For example, a person can be referred to as an entity, or a flock of birds can be referred to as a single entity. A house is built up from many different parts that have a structural relationship. Similarly, data usually has attributes, which are emergent properties of entities and may not exist independently. For example, the color of a car is an inseparable attribute of the car. Thus, traditional data visualization requires considering the type of data with regard to its entities, relations and attributes.
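The entity–relation–attribute view can be illustrated with a small data-structure sketch. The Python dataclasses below are hypothetical and not part of Ware's text; they simply show how entities, their attributes and the relations between them might be held before being mapped to a visual form.

```python
# A minimal sketch of the entity / relation / attribute distinction,
# using hypothetical Python dataclasses (illustration only).
from dataclasses import dataclass, field

@dataclass
class Entity:
    """An object to be visualized, e.g. a person or a car."""
    name: str
    attributes: dict = field(default_factory=dict)  # emergent properties, e.g. color

@dataclass
class Relation:
    """A structural link between two entities, e.g. 'part of' or 'parent of'."""
    source: Entity
    target: Entity
    kind: str

# A car entity with an inseparable attribute (its color) ...
car = Entity("car", {"color": "red"})
wheel = Entity("wheel", {"diameter_cm": 43})
# ... and a structural relation, as in the house built up from parts.
part_of = Relation(source=wheel, target=car, kind="part-of")
```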

Information visualization, which is listed in Table 2.1, has become an emerging area over the last decade. It is considered information design, which is “about the selection, organization and presentation of information to a given audience” (Wildbur & Burke, 1998, p. 6). Information design takes the efficient communication of information as its foremost task, which requires the content to be accurate and unbiased in its visual presentation. It attempts to present all the objective data required to enable users to make some kind of decision (Wildbur & Burke, 1998). According to Wildbur and Burke (1998), information visualization can be categorized into three parts. The first refers to information as an organized arrangement of facts or data; for example, a timetable, a signing system and most maps fall into this category. Such organized information lets users freely extract only the information they need for a given purpose. The second includes “information presented as a means of understanding a situation or process, such as a guide-book, a bar-chart or a stage-by-stage description of how to get a machine to operate” (Wildbur & Burke, 1998, p. 7). The third includes the design of control systems, for example the input or feedback controls of a product. The common forms for information visualization involve tables, charts, trees, maps, scatter plots, diagrams and graphs, as indicated in the table.

In scientific visualization, as indicated in the table, the forms of data visualization often refer to simulations, waveforms or volumes, in which data is explored, transformed and represented as an image or simulation in order to obtain insight into the phenomena. It is quite common for scientific visualization to use 3D modelling to represent scientific data, because the data can possess various dimensions (Ware, 2012). In fact, there are many different forms of scientific visualization. There are also more specific forms of data visualization, such as area charts, bar charts, bullet graphs, box-and-whisker plots, bubble clouds, cartograms, circle views, dot distribution maps, Gantt charts, heat maps, highlight tables, histograms, matrices, networks, polar areas, radial trees, streamgraphs, text tables, treemaps, wedge stack graphs and word clouds (Data visualization beginner’s guide: a definition, examples, and learning resources, n.d.). These specific forms are less commonly used in information visualization or scientific visualization than the common forms. Table 2.2 presents the specific forms of data visualization along with the common forms that have already been listed in Table 2.1.

Table 2.2 The forms of data visualization. It lists the common and specific forms of data visualization

Drawing on previous research, the common forms of data visualization are identified and discussed in the following sections (summarized in Table 2.1).

Trees

Trees are considered one of the earliest visual illustrations of human thought systems (Lima, 2011). Their hierarchical structure has a great advantage in organizing, rationalizing, and illustrating information patterns, and provides a key to interpreting the “evolving complexities of human understanding, from theological beliefs to the intersections of scientific subjects” (Lima, 2011, p. 21). Humans have often used trees as a symbol to represent classifications of the natural world (Lima, 2011). In visualization, a tree can also refer to the “Tree-Map” visualization (Johnson & Shneiderman, 1999). Hierarchical structures involve two types of information: structural information associated with the hierarchy and content information associated with each node. For example, the branched schema of trees has been used for organizing family ties, social structures, and species evolution. The Tree of the Two Advents by Joachim of Fiore in 1202 is an example of a tree that illustrates key figures and institutions of Christian salvation history. This tree displays key figures from top to bottom, depicting Jesus Christ, Ozias the Prophet, Jacob the Patriarch, and Adam. The image of Christ dominates the whole tree, and the lower branches link the image of Jacob to the twelve tribes of Israel. The Christ in the middle links twelve branches that represent the twelve churches (Lima, 2011).

Figure 2.2 presents a Tree-Map of architectural history from 1896. The Tree-Map was produced as a schematic diagram of what Banister Fletcher recognized as the branches of architectural styles, and was published in the first edition of Fletcher’s A History of Architecture. The author highlights a cross-cultural and historical evolution of architectural styles through a series of successive branches (Flood, 2012). The Tree-Map shows the evolution of various architectural styles across branches including Peruvian, Egyptian, Greek, Assyrian, and Chinese and Japanese. At the base of the tree trunk are a number of individual architectural styles, and at the crown of the tree several styles are repeated. It presents an overview of the entire hierarchy and makes the navigation of each node much easier.

Fig. 2.2 The Tree of Architecture (1896), from A History of Architecture on the Comparative Method. London: Athlone Press, University of London. (Image courtesy of Wikimedia Commons)

Gülsüm Baydar Nalbantoglu commented on the diagram of The Tree of Architecture as follows:

The “Tree of Architecture” has a very solid upright trunk that is inscribed with the names of European styles and that branches out to hold various cultural/geographical locations. The nonhistorical styles, which unlike others remain undated, are supported by the “Western” trunk of the tree with no room to grow beyond the seventh-century mark. European architecture is the visible support for nonhistorical styles. Nonhistorical styles, grouped together, are decorative additions, they supplement the proper history of architecture that is based on the logic of construction. (Morrison, 2015)

In the field of data visualization, the visual metaphor of the tree is often used to represent a network of nodes consisting of a root node and subordinate nodes within a branched schema. Trees are one of the most useful forms for depicting data, starting from a root node and then tracing down through its subordinate nodes until all nodes have been explored (Spence, 2007). Trees can take various forms, such as the TreeMap, SpaceTree, and StarTree, suited to different data or information (Cawthon & Moere, 2007). The forms of trees have been used to represent data in many fields, including business and healthcare, amongst others. Spence (2007) presents a typical example of a tree map, which effectively displays a large amount of data for identifying patterns and tendencies (see the Status of Companies in Different Fields on the Smartmoney.com website). Figure 2.3 shows a tree map that uses the hierarchical structure of a tree (roots, trunk, branches) but applies different visual forms. It presents clear information from financial markets, providing investors with an instant overview of market activity.

Fig. 2.3 Smartmoney.com (n.d.), The status of companies in different fields
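A tree map of this kind is produced by a recursive space-filling layout. The following is a minimal sketch of a slice-and-dice layout with a hypothetical node structure and sizes; it is a simplified illustration, not the exact algorithm behind Fig. 2.3.

```python
# A minimal sketch of a slice-and-dice treemap layout: every node in the
# hierarchy receives a rectangle, and children share their parent's rectangle
# in proportion to their size. The tree and values below are hypothetical.
def treemap(node, x, y, w, h, depth=0, out=None):
    """Recursively assign a rectangle (x, y, w, h) to every node in the tree."""
    if out is None:
        out = []
    out.append((node["name"], x, y, w, h))
    children = node.get("children", [])
    if not children:
        return out
    total = sum(c["size"] for c in children)
    offset = 0.0
    for child in children:
        share = child["size"] / total
        if depth % 2 == 0:   # split horizontally at even depths ...
            treemap(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:                # ... and vertically at odd depths
            treemap(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out

market = {"name": "market", "size": 100, "children": [
    {"name": "technology", "size": 60, "children": [
        {"name": "hardware", "size": 25}, {"name": "software", "size": 35}]},
    {"name": "energy", "size": 40},
]}

for name, x, y, w, h in treemap(market, 0, 0, 1, 1):
    print(f"{name:10s} rectangle at ({x:.2f}, {y:.2f}) size {w:.2f} x {h:.2f}")
```

Each leaf receives a rectangle whose area is proportional to its size, so large values stand out immediately, much as the largest companies do in Fig. 2.3.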

Maps

A map is considered a graphical depiction highlighting relationships between elements of space (objects or regions). The term “map” comes from the Latin “mappa mundi”, in which “mappa” refers to cloth and “mundi” means the world. A map can therefore be defined as a 2D representation of the surface of the world. In fact, a map can represent any space regardless of context, for example DNA mapping or computer network topology mapping. However, a map most often refers to a geographic map, which has been used to help people define, explain and navigate through the world. As one of the most important inventions in human history, maps have been made and developed over many centuries.

Cartography is the study and practice of crafting representations of the earth upon a flat surface; it refers to the production and study of maps. The term comes from the French cartographie, which is based on the Latin carta. According to the International Cartographic Association (ICA), a map is “a symbolized image of geographical reality, representing selected features or characteristics” (Crampton, 2001, p. 240). This definition makes clear that the symbols need to match their referents. Visual variables, which “use discrete symbols (e.g., choropleth maps) to show discrete data such as sales tax rates, and continuous symbols (e.g., isarithmic maps) to show continuous data such as temperatures”, serve as “a set of map graphic building-blocks which match spatial phenomena” (Crampton, 2001, p. 240).

Robinson (1952) argues for a more scientific approach to cartography, in which “function provides the basis for the design” (p. 13). In Robinson’s view of cartography, maps are understood through the map communication model (MCM). He argues that “a map maker makes the map created by a cartographer who is supposed to be sensitive to the capabilities of his envisaged map reader” (Robinson, Morrison, & Muehrcke, 1977, p. 6). Robinson et al. (1977) further emphasize that “corollaries of this view are a lessened concern for the map as a storage mechanism for spatial data and an increased concern for the map as a medium of communication …” (p. 6). This statement highlights a separation between the cartographer and the map users: the users receive information from the cartographer through maps. The MCM was considered to promote “the development of a philosophical and conceptual framework in cartography …” (Andrews, 1988, p. 185).

Navigational maps are the most popular maps, including road maps, aeronautical and nautical charts, railway or subway network maps, and even hiking or bicycling maps. For example, Google Maps is a web-based mapping service developed by Google. It provides a range of services, including satellite imagery, aerial photography, street maps and so on. Most maps of the world are designed as one of four main types: political maps, which show the territorial borders of each country or city; physical maps, which present geographical features such as mountains, land and rivers, as well as roads, railways and buildings; topographic maps, which present elevation and relief with contour lines; and geologic maps, which serve the special purpose of displaying geological features such as rock units, bedding planes, structural features and plunge symbols. These different types of maps serve different purposes. For example, a geologic map can be used for geological exploration, while a political map may help people plan their travel. Figure 2.4 shows a map of ancient Greece from 1752, which depicts the land of Greece surrounded by seas and uses scale and coordinate techniques to present geographical information.

Fig. 2.4 Map of Ancient Greece, Graecia Vetus (Macedonia, Thessaly, Epirus, Achaia, Peloponnesus) in 1752

Maps were traditionally designed to be static and fixed on paper or other durable materials. The invention of computer technology has facilitated the development of new kinds of maps, such as dynamic and interactive maps. One of the most important improvements is the use of computer programs to generate new forms of maps. A basic function is interactivity, which allows users to zoom in and out of the map. Users can also easily search for the locations they want to travel to on a portable map device using GPS (the Global Positioning System).

Geographic visualization (GVis) uses the map’s power to explore, analyze and visualize spatial datasets in order to understand patterns. These developments are key components of a “maps as social constructions” approach, emphasizing the genealogy of power in mapping practices and enabling multiple, contingent and exploratory perspectives on data. Figure 2.5 presents real-time flight information at Australian airports during the Covid-19 coronavirus outbreak in early 2020. It shows far fewer flights in Australian skies than before, because strict travel restrictions caused a large number of flights to be cancelled.

Fig. 2.5 Map of live flight tracking, a screenshot from the FlightAware website

Scatter Plots

A scatter plot is considered one of the earliest and most widely used data visualizations, and it is based on the Cartesian coordinate system (Ward, Grinstein, & Keim, 2010). It is a very useful form when multivariate data is presented with x and y coordinates (Cleveland, 1993), as each point presents data for two variables in two dimensions. A scatter plot is critically important in scientific research, as it often presents the data supporting important findings. Research by Weissgerber, Milic, Winham, and Garovic (2015) emphasizes the importance of the scatter plot in scientific publication. They found that a scatter plot can enhance readers’ understanding of published data and that “the increased flexibility of univariate scatterplots also allows authors to convey study design information” (Weissgerber et al., 2015). It is also very flexible for use in small-sample-size studies.
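As a simple illustration, the following sketch plots one point per observation on Cartesian x and y axes; it assumes Python with numpy and matplotlib and uses synthetic data rather than the Babinet-point data discussed below.

```python
# A minimal sketch of a bivariate scatter plot on Cartesian coordinates.
# Synthetic data only; not taken from Cleveland (1993).
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
concentration = rng.uniform(0, 100, size=50)                  # x variable
response = 15 + 0.05 * concentration + rng.normal(0, 1, 50)   # y variable with noise

fig, ax = plt.subplots(figsize=(4, 4))
ax.scatter(concentration, response, s=20, alpha=0.7)  # one point per observation
ax.set_xlabel("Concentration (x)")
ax.set_ylabel("Response (y)")
ax.set_title("Each point encodes one observation of two variables")
fig.savefig("scatter.png", dpi=150)
```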

For example, Fig. 2.6 consists of a vertical scale and a horizontal scale, emphasizing that each element of the bivariate data (data for two variables) depends on the other, e.g. how the Babinet point (the vertical axis; a neutral point lying between 15° and 20° above the sun) depends on particulate concentration (the horizontal axis) (Cleveland, 1993).

Fig. 2.6 An exploratory scatter plot graphs the Babinet point against particulate concentration

Charts

Charts are useful forms for representing data as lines, bars, or slices. Traditionally, charts represent data in the form of a table, graph, or diagram. For example, Fig. 2.7 presents a chart published in the New York Times in 1976, displaying information about budget expenditures and aid to localities in New York State, in which data is organized as bars that convey information through their sizes. This form can be very useful for representing quantitative data that conveys information about sizes or dimensions.

Fig. 2.7 New York State total budget expenditures and aid to localities, New York Times (1976)

As an example of visually representing information using different visual characteristics, Tufte (1997) presents a chart of the History of O-Ring Damage in Field Joints, in which each bar of the graph is drawn as a small rocket, reflecting a popular, recognizable object rather than abstract forms (bars, lines, and triangles). This chart effectively shows information about temperature and damage, which helps to facilitate possible insights into the data.

Another pioneer of data visualization is Francis Galton, who made many contributions to data visualization and statistical graphics. He is well known for his role in developing the ideas of correlation and regression in graphic representation, and also for his work on weather patterns beginning in 1861, shown in Fig. 2.8. The chart presents barometric pressure, wind direction, rain and temperature for the afternoon and evening of every day in December 1861. For each day in the chart, “the 3 × 3 grid shows schematic maps of Europe, mapping pressure (row 1), wind and rain (row 2) and temperature (row 3)” (Friendly, 2008, p. 16). The chart presents weather information so clearly that people can easily see “the series of black areas (low pressure) on the barometric charts for about the first half of the month, corresponding to the counter-clockwise arrows in the wind charts, followed by a shift to red areas (high pressure) and more clockwise arrows” (Friendly, 2008, p. 16).

Fig. 2.8 The multivariate weather chart of Europe, Galton (1863); it shows barometric pressure, wind direction, rain, and temperature for the afternoon and evening of each day during December 1861

Fig. 2.9 Province of Ontario - vital statistics, Hardy (1887)

Tables

A table is one of the most common forms for displaying data in many fields. A data table can be an efficient form for comparing data on categorical objects. It usually uses rows and columns to represent data in two dimensions in a meaningful way (Harris, 1996). The quantitative data are placed in the squares at the intersections of rows and columns (known as table cells).

Traditionally, the columns of a table represent dimensions or fields, and the rows represent the actual data records. It is an easy and simple visual form for representing data or datasets.
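A minimal sketch of this row/column structure, assuming Python with the pandas library and illustrative placeholder figures (not Hardy's 1886 data), is shown below.

```python
# A minimal sketch of a data table: columns are fields, rows are records,
# and each row/column intersection is a table cell. Figures are placeholders.
import pandas as pd

table = pd.DataFrame(
    {
        "population_1886": [42000, 35500, 61200],
        "number_of_deaths": [512, 430, 790],
    },
    index=["County A", "County B", "County C"],  # rows: the individual records
)

# A derived column; each value sits in a cell at a row/column intersection.
table["ratio_to_population"] = table["number_of_deaths"] / table["population_1886"]
print(table)
```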

Compared to other visual forms, tables have a number of advantages, including that they are “one of the best ways to convey exact numerical value” (Harris, 1996, p. 387) and that they “assist the viewers in making comparisons, determining how things are organized, noting relationships between various sets of data” (Harris, 1996, p. 387). Moreover, they can organize information where graphing might not be an appropriate representation (Harris, 1996). The rows and columns of a table can use words, numbers or symbols.

Figure 2.9 is a typical table presenting data on the population, the number of deaths and the ratio of deaths to population in Ontario in 1886. On the left side of the table is a list of counties, covering every county in Ontario, Canada. The columns on the right of the table present information in categories including “population in 1886”, “number of deaths”, “ratio to population”, “phthisis”, and “anemia”. The data presented in the table is clear and accurate, and readers can easily find the population and the number of deaths from phthisis in each county in 1886.

A timetable is a type of table. It has been used as a reference and management tool for scheduled events, tasks, actions or appointments. Organizing the data within a table in chronological or alphabetical order helps users reference it more quickly. Timetables are a basic way to show arrival and departure times, for example of buses, trains and other forms of transportation. A timetable is also an ideal tool for individual time management.

Diagrams

A diagram consisting of rectangles and lines illustrating information flow can be used in business-system analysis. For example, The Diagram Displays a Relationship and Information Flow for a Business System Analysis by Gerald Lohse emphasizes that the addition of background shading can enhance the understandability of an image by differentiating groups of information from one another (Keller & Keller, 1993). Figure 2.10 presents two statistical diagrams of the wholesale trade in poultry and game at the Halles Centrales. A color-coded reference and legend are included in the diagram, which presents the different types of poultry and their prices in each month of 1888. The color of each column represents one of six types of poultry (chickens, pigeons, ducks, geese, turkeys and guinea fowl), so readers can follow the price of each animal across the months of 1888.

Fig. 2.10 Atlas de statistique graphique de la ville de Paris. I. Annee (1888)

Before computer technology was invented, most graphs were completed by hand drawing. Although hand drawing can accurately present data or information in the form of a graph, it is very difficult to handle a large amount of data, such as internet data flows, by hand. Figure 2.11 shows a social network data visualization created with computer software.

Fig. 2.11 Social network visualization published in Grandjean (2015)

Graphs

A graph is similar to a diagram; in fact, the terms graph, diagram and chart are similar and can overlap. Graphs may serve many purposes in presenting data. For example, a graph can help people perceive and recognize the broad features of the data or information, and it can also help people look at what lies behind the data through those broad features (Anscombe, 1973). A graph may not present data with precision; however, it can provide insight into the data.

A classic example of a graph is Interest of the National Debt from the Revolution by William Playfair in 1786, representing the interest on the British government’s debt from 1688 to 1784 (Tufte, 2001, p. 65). The graphic clearly depicts the skyrocketing debt. Another example concerns the first cholera outbreak in Great Britain in 1831. The epidemic caused 52,000 deaths within eighteen months, and the water-borne cause of the disease was not established until Dr. John Snow drew a famous map in 1855, which is considered a landmark graphic discovery (Fig. 2.12).

Fig. 2.12 The map of Leeds (1833), showing the districts affected by cholera

Waveform

A waveform is a common form used for the visual representation of a signal; it refers to the shape of a quantity of data as it varies over time. It is often used in the fields of medicine, engineering and earth science. For example, EEG data visualization is a typical waveform that is widely used in medical research. Visualizing the stages of non-rapid eye movement sleep is another example, in which the shape of the waveform, with its frequencies and bands, shows electrical activity over time (Carskadon & Dement, 2011). A further example involves three waveforms representing different stages of a study. Figure 2.13 (a) presents a peripheral waveform (e.g., from the brachial artery); Fig. 2.14 (b) presents a waveform analyzed by a mathematical algorithm based on a generalized transfer function, with modulus amplification and phase lag characteristics presented; and Fig. 2.15 (c) presents a reconstructed central waveform. From this waveform, information about the central systolic blood pressure (SBP), diastolic blood pressure (DBP) and pulse pressure (PP) is derived, and the central arterial systolic pressure (CASP) is defined at the peak. This information is effectively displayed in the waveforms.

Fig. 2.13 An example of a peripheral waveform (a) (2016)

Fig. 2.14 An example of a generalized transfer function (b) (2016)

Fig. 2.15 An example of a central waveform (c) (2016)

A waveform is often used to visualize the data generated by earthquakes, which are recorded and displayed as a seismic waveform (Keller & Keller, 1993, p. 72). The waveform is an effective way to represent the magnitude of an earthquake. In Fig. 2.16, horizontal earth movement is represented by a dark color, and larger waveforms indicate larger earthquake magnitudes. The left column presents the time of day. This waveform was recorded at the Jajag station in Indonesia in 2018. It effectively presents earthquake magnitude data, and people can easily understand the size of the magnitude.

Fig. 2.16 The seismic waveform recorded by the GFZ German Research Center for Geosciences (2018)
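In its simplest form, a waveform is an amplitude plotted against time. The following sketch, assuming Python with numpy and matplotlib and a synthetic signal (not real EEG or seismic data), shows how such a time series is rendered as a waveform:

```python
# A minimal sketch of rendering a time-varying signal as a waveform.
# The signal is synthetic: a low-frequency oscillation, a higher-frequency
# component, and some measurement noise.
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 10, 2000)                      # time axis in seconds
signal = np.sin(2 * np.pi * 1.5 * t)              # a 1.5 Hz base oscillation
signal += 0.3 * np.sin(2 * np.pi * 7 * t)         # a higher-frequency component
signal += 0.1 * np.random.default_rng(1).normal(size=t.size)  # noise

fig, ax = plt.subplots(figsize=(8, 2.5))
ax.plot(t, signal, linewidth=0.8, color="black")  # amplitude drawn against time
ax.set_xlabel("Time (s)")
ax.set_ylabel("Amplitude")
fig.savefig("waveform.png", dpi=150)
```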

Simulation

Simulation is one of the most useful forms of scientific visualization. One of its important advantages is that it enables viewers to understand and describe natural phenomena such as weather changes or the role of dark energy in the universe (Ahrens et al., 2010). Computer technology has been used to simulate natural phenomena such as thunderstorms and global climate change (Thalmann, 1990; Rosenblum, 1995). Figure 2.17 is an example of a computer climate simulation, which depicts carbon dioxide from various sources advected individually as tracers in the atmosphere model; carbon dioxide from the land is shown as plumes during 1900. This visualization adopts a three-dimensional model to simulate the climate, providing a realistic environment for effective presentation.

Fig. 2.17 Climate visualization of atmosphere (2007). (Image courtesy of Forrest Hoffman and Jamison Daniel of Oak Ridge National Laboratory)

Simulation is a popular way of visualizing weather- and climate-related data, as it can present the dynamics of climate change. For example, NASA (the National Aeronautics and Space Administration) has simulated climate warming, as shown in Figs. 2.18 and 2.19. This data visualization shows how global temperatures rose from 1950 to the end of 2013. Figure 2.18 shows that in 1950 the ocean areas are almost entirely blue and the land is yellow and blue, whereas by 2013 the colors have changed to yellow and red across the continents and much of the ocean (Fig. 2.19).

Fig. 2.18 A screenshot of climate change in 1950, created by NASA

Fig. 2.19 A screenshot of climate change in 2013, created by NASA

It is apparent that simulation can effectively present natural phenomena in a dynamic model, which is significantly different from other types of data visualization. In a simulation, people can see a process of change over a certain period of time, and they may gain insights or new findings from the simulation visualization.

Volume

Volume visualization is considered one of the most important forms of scientific visualization. It creates graphical representations of data sets defined on 3D grids. Volume data sets are multidimensional arrays of scalar and vector data, typically defined on lattice structures representing values sampled in a 3D environment (Knupp, 1999). The two basic types of volume data are scalar data and vector data (Bürger & Hauser, 2007). Scalar data contains a single value for each point, while vector data contains two or three values for each point, defining the components of a vector. With the development of the 3D data acquisition field, volume datasets are growing fast. Volume data can be captured by many technologies, such as CT or MRI scanning. Volume visualization has been explored in many domains and is often used to represent medical data, such as functional MRI data from scans and data from confocal microscopy (Kaufman, 1996).

Volume data visualization can be divided into five types: the slice-based approach, emulation of other technology, volume rendering, indirect volume rendering, and direct volume rendering (Tukalo, n.d.). The slice-based approach is simple to implement and has low computational complexity; however, its disadvantage is that users need to reconstruct the whole object in their imagination, as it only visualizes part of the object at a time (Tukalo, n.d.). Many experts prefer to use the emulation of other technology for visual analysis, as “the emulation allows experts to have a smooth transition to a modern technic from their old solutions” (Tukalo, n.d., p. 1). Volume rendering, the core technique of volume visualization, focuses on rendering a 3D data set into a 2D visualization. Indirect volume rendering retains all the typical features of 3D objects and thus makes complex 3D analysis simpler. Direct volume rendering is considered a powerful way to visualize volume data, as it has almost all the advantages of polygonal mesh models and can also combine these models in the same scene (Tukalo, n.d.).
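As an illustration of the simplest of these approaches, the following sketch shows a slice-based view of a scalar volume, assuming Python with numpy and matplotlib and a synthetic density field rather than CT or MRI data.

```python
# A minimal sketch of the slice-based approach: a 3D scalar field is stored as a
# multidimensional array and one 2D slice is displayed at a time. The field here
# is a synthetic sphere-like density, not real scan data.
import numpy as np
import matplotlib.pyplot as plt

# Scalar volume data: one value per grid point on a 64x64x64 lattice.
x, y, z = np.mgrid[-1:1:64j, -1:1:64j, -1:1:64j]
volume = np.exp(-(x**2 + y**2 + z**2) * 4)

# Slice-based visualization: pick a plane (here the middle z-slice) and render it
# as a 2D image; the viewer mentally reconstructs the object from such slices.
middle = volume.shape[2] // 2
fig, ax = plt.subplots(figsize=(4, 4))
im = ax.imshow(volume[:, :, middle], cmap="viridis", origin="lower")
fig.colorbar(im, ax=ax, label="Scalar value")
ax.set_title(f"z-slice {middle} of a 64x64x64 scalar volume")
fig.savefig("slice.png", dpi=150)
```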

For example, Fuchs, Levoy, and Pizer (1990) created a 2D image of a human skull rendered from CT scanner data. It demonstrates how volume rendering, using a ray-tracing technique, can present a tumor in purple within a human head and a radiation treatment beam in blue. Another example of volume rendering is shown in Fig. 2.20, a visualization of inbound traffic, measured in billions of bytes, on the NSFNET T1 backbone for the month of September 1991. The volume range is displayed from purple (0 bytes) to white (100 billion bytes). It represents data collected by Merit Network, Inc.

Fig. 2.20 Regional Networks Traffic in 1991. (Image courtesy of Merit Network, Inc., NCSA, and the National Science Foundation)

The forms in Table 2.1 can be used in both information visualization and scientific visualization, although the data and datasets in the two fields can differ, and the choice of visualization tool depends on the purpose of the visualization. The forms of tables and charts can be found in both visualization fields; however, some forms are commonly used only in one or the other. For example, simulation is one of the main forms used to represent phenomena in scientific visualization, but it is seldom used in information visualization. Many data visualization techniques adopt hybrids of various forms, such as the combination of a chart and a graph. The forms in Table 2.1 are the common techniques for data visualization. Although their visual effects differ, all forms have similar characteristics, which emphasize that the represented data should have readability, recognizability and meaning (Kosara, Drury, Holmquist, & Laidlaw, 2008). The next section discusses the characteristics of the traditional data visualization forms listed in Table 2.1.

2.1.4 The Characteristics of Traditional Data Visualization

Various visualization techniques are available for designing good data visualizations that can help people produce important insights. To create good data visualization, one should first understand the common characteristics of traditional data visualization. As data visualization has received attention in many fields over the last decade, it requires strong skills in storing, managing and analyzing huge data flows, as well as the ability to visualize data effectively for communication (Yarmuluk, 2019). Table 2.3 outlines the key characteristics of traditional data visualization, which apply to both information visualization and scientific visualization.

Table 2.3 The characteristics of traditional data visualization

The review of the literature suggests that the common characteristics of traditional data visualization include readability, recognizability and meaning. These characteristics are based on data communication for comprehension and knowledge acquisition. Data visualization may have further characteristics, such as graphical design, color and interactivity, but the essential and common characteristics remain readability, recognizability and meaning. Selecting visual forms from Table 2.1 for the transformation of data into information facilitates these characteristics.

Firstly, data visualization should be readable. Readability is one of the most important characteristics of data visualization: the visual representation is intelligible and can be understood easily. Secondly, users should be able to recognize the visualized data. Recognizability is another important characteristic of data visualization, allowing users to identify previous knowledge from the visual form. Lastly, data visualization should convey information people can understand. The visual representation of data assists users in understanding the meaning of the data, which is an important characteristic of both information visualization and scientific visualization.

2.2 Data Visualization and Human Perception

Human perception plays an important role in data visualization. Ware (2012) suggests that perception can remarkably improve both the quality and the quantity of information being displayed. Perception refers to the way sensory data is organized, interpreted and consciously experienced. It can also be defined as “the process of recognizing (being aware of), organizing (gathering and storing), and interpreting (binding to knowledge) sensory information” (Ward, Grinstein, & Keim, 2010, p. 81). Clearly, perception is linked to the human senses, which produce signals from the environment through the five senses of sight, hearing, touch, smell and taste. Ward et al. (2010) explain the notion of perception as follows:

Simply put, perception is the process by which we interpret the world around us, forming a mental representation of the environment. This representation is not isomorphic to the world, but it’s subject to many correspondence differences and errors. The brain makes assumptions about the world to overcome the inherent ambiguity in all sensory data, and in response to the task at hand. (p. 82)

This explanation emphasizes that human perception is a subjective mental activity, which interprets the data or information we receive from the environment around us.

When quantitative data is presented in graphical form, users are asked to use their visual perception to make judgements about the visual form: to compare the relative sizes, locations, orientations, colors, densities and textures of the elements of the visualization. In fact, human visual perception is a very complicated and subjective process, and the effectiveness of a visualization for conveying objective understanding rests on a wide range of subtle factors (Reuter, Tukey, Maloney, Pani, & Smith, 1990).

2.2.1 The Perceptual Process

In psychological studies, researchers and scholars have looked at how the human visual system perceives and analyses images for many years. One of the key findings is “the discovery of a limited set of visual properties that are detected very rapidly and accurately by the low-level visual system”; these properties were named “preattentive” (Healey, 2007). The term preattentive is related to attention. Healey (2007) examines four theories of preattentive processing: feature integration theory, texton theory, similarity theory, and guided search theory.

Feature integration theory:

  • This is a theory of attention developed by Anne Treisman and Garry Gelade. Treisman provided important insights into preattentive processing (Healey, 2007). Firstly, Treisman tried to determine which visual properties are detected preattentively; these were named “preattentive features” (Trick & Pylyshyn, 1994). Secondly, Treisman formulated a hypothesis about how the human visual system performs this processing (Rheingans & Tebbs, 1990). Treisman suggests that when a human perceives a stimulus, features are “registered early, automatically, and in parallel, while objects are identified separately”. It has been regarded as one of the most influential psychological models of human visual attention.

Texton theory:

  • Béla Julesz studied texture patterns and attempted to determine whether variations in a particular order statistic were seen by the human low-level visual system. He named a group of features detected by the early visual system “textons”, which can be categorized into three parts:

    a. Elongated blobs: specific properties such as hue, orientation, and width, for example, line segments, rectangles, ellipses;

    b. Terminators: ends of line segments;

    c. Crossings of line segments.

It is believed that only a difference in textons or in their density can be detected preattentively (Healey, 2007).

Similarity theory:

  • This theory concerns search time in perception. Some researchers have investigated conjunction searches by focusing on two elements: (1) search time may depend on the number of items of information required to identify the object; and (2) it may depend on how easily an object can be distinguished from its distractors, regardless of the presence of unique preattentive features (Healey, 2007). Search time is based on two criteria: T-N similarity (the similarity between the target and the non-target distractors) and N-N similarity (the similarity among the non-targets themselves).

Guided search theory:

  • A new visual search theory was proposed by Jeremy Wolfe, who coined the term “guided search”. Guided search theory is a two-stage model of visual processing in which initial parallel search mechanisms direct subsequent serial search mechanisms (Wolfe, 2014). One example is the process of visually searching for a red circle among green circles and red squares. The theory suggests that there are two sensory detectors, one sensitive to the color red and the other detecting only circular shapes. The information from the two detectors is amalgamated, and the target is identified by analyzing the most likely candidate. If the target is incorrectly identified, the next most suitable candidate is chosen (Wolfe, 2014).

Research on perception also involves illusion, which is considered a distortion of the senses, as researchers attempt to find out how the human brain normally organizes and interprets sensory stimulation. This book does not address the theories of preattentive processing further, but it gives readers a sense of the perception theories needed to understand how the human visual system relates to data visualization.

Human perception involves signals that travel through the nervous system, which in turn result from physical or chemical stimulation of the sensory system. For example, vision involves light striking the retina of the eye, smell is mediated by odor molecules, and hearing involves pressure waves. Data visualization focuses on visual perception, which is the primary human sense. Understanding how humans perceive images can help people understand the importance of data visualization in our lives and society.

2.3 Conclusion

In summary, the forms and characteristics of traditional data visualization emphasize the comprehension of data by using appropriate visual forms to represent information. Data visualization has many visual forms; the main ones include trees, scatter plots, charts, tables, maps, diagrams, graphs, waveforms, simulations and volumes, and most of these visual forms can overlap in use. The chapter emphasizes that good data visualization has the characteristics of readability, recognizability and meaning. This chapter has also discussed how the human visual system perceives data and information, reviewing the definition of human perception and some important theories of preattentive processing, including feature integration theory, texton theory, similarity theory and guided search theory.

However, these visual forms and characteristics do not necessarily involve aesthetic considerations or approaches to designing visualization. Data visualization has ignored the contribution of aesthetics to enhancing understanding or insight. Nevertheless, aesthetics has more recently been identified as having value for data visualization. This reflects two aspects: the relationship between aesthetics and usability in the design of human-computer interaction (HCI), and visualization considered as artistic creation that emphasizes the value of visualization “in its own right and for its own purposes” (Van Wijk, 2006, p. 428). The value of aesthetics within the HCI field was first introduced by Kurosu and Kashimura in 1995 (Tuch, Roth, Hornbaek, Opwis, & Bargas-Avila, 2012), demonstrating the importance of aesthetics in creating an interface that “significantly influences user’s perceived ease of use of the entire system” (Tuch et al., 2012, Introduction section, para 2). Visualization as art, which gained attention during the early twenty-first century, positions the value of aesthetics as concerning not only images and visual representations but also ideas, methods, and techniques. Van Wijk (2006) argues that in order for “aesthetic criteria for new methods to be effective guides…each link of the chain from idea, mathematical model, algorithm, implementation, to visual result is clean, simple, elegant, symmetric” (p. 428).

The next chapter will address how aesthetics underpins data visualization, including recent research on aesthetic approaches to data visualization and the contributions of aesthetics. Chinese aesthetic concepts and the Kantian sublime will be explored in relation to visualizing data.