3.1 OLAP Visualization
OLAP tools enable users to analyze multidimensional data interactively from various perspectives. OLAP comprises five basic analytical operations: consolidation (roll-up), drill-down, pivoting, slicing and dicing [14]. Applying these operations changes the selection of visualized components on the fly and updates the view according to the user’s actions.
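The five operations can be illustrated with a minimal sketch in plain Python. The fact table, the dimension names and the helper functions below are hypothetical and chosen only for illustration; they do not reproduce the interface of any particular OLAP tool.

```python
from collections import defaultdict

# Hypothetical fact table: (year, quarter, region, product, sales).
DIMS = ("year", "quarter", "region", "product")
FACTS = [
    (2023, "Q1", "EU", "laptop", 100),
    (2023, "Q1", "US", "laptop", 150),
    (2023, "Q2", "EU", "phone", 80),
    (2023, "Q2", "US", "phone", 120),
    (2024, "Q1", "EU", "laptop", 110),
    (2024, "Q1", "US", "phone", 90),
]


def aggregate(facts, dims):
    """Sum the sales measure grouped by the given dimension names."""
    idx = [DIMS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_(facts, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return [row for row in facts if row[i] == value]


def dice(facts, **allowed):
    """Dice: keep rows whose dimensions fall inside the allowed value sets."""
    return [
        row for row in facts
        if all(row[DIMS.index(d)] in values for d, values in allowed.items())
    ]


def pivot(cells):
    """Pivot: swap the two axes of a two-dimensional aggregate."""
    return {(b, a): total for (a, b), total in cells.items()}


by_year = aggregate(FACTS, ["year"])                      # roll-up
by_year_quarter = aggregate(FACTS, ["year", "quarter"])   # drill-down
eu_only = slice_(FACTS, "region", "EU")                   # slice
laptops_2023 = dice(FACTS, year={2023}, product={"laptop"})  # dice
region_by_year = pivot(aggregate(FACTS, ["year", "region"]))  # pivot

print(by_year)  # {(2023,): 450, (2024,): 200}
```

In this sketch, roll-up and drill-down are the same aggregation at a coarser or finer grouping, which is why a single `aggregate` helper serves both.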
A traditional interface for analyzing OLAP data is a pivot table, or cross-tab, which is a multidimensional spreadsheet produced by specifying one or more measures of interest and selecting dimensions to serve as vertical (and, optionally, horizontal) axes for summarizing the measures [9]. Pivot tables are a widespread visualization model that provides a detailed data presentation to users who are familiar with it. Additionally, they can easily be extended to convey additional dimensions of information through color-coding or the merging of rows and columns. However, the efficiency of tables declines for larger data sets, as users become unable to locate specific information, recognize patterns or get an overview of the displayed data (Fig. 1).
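The construction of a cross-tab can be sketched as follows, assuming a hypothetical list of records with one measure (`sales`) and two dimensions (`region` as the vertical axis, `quarter` as the horizontal axis). The function name and the data are illustrative only.

```python
from collections import defaultdict


def cross_tab(records, row_dim, col_dim, measure):
    """Return (row_labels, col_labels, grid), summing `measure` per cell."""
    cells = defaultdict(float)
    for rec in records:
        cells[(rec[row_dim], rec[col_dim])] += rec[measure]
    rows = sorted({r for r, _ in cells})
    cols = sorted({c for _, c in cells})
    # Empty combinations are shown as 0.0 rather than left out.
    grid = [[cells.get((r, c), 0.0) for c in cols] for r in rows]
    return rows, cols, grid


# Hypothetical records.
records = [
    {"region": "EU", "quarter": "Q1", "sales": 100.0},
    {"region": "EU", "quarter": "Q2", "sales": 80.0},
    {"region": "US", "quarter": "Q1", "sales": 150.0},
    {"region": "US", "quarter": "Q1", "sales": 20.0},
]

rows, cols, grid = cross_tab(records, "region", "quarter", "sales")
print("      ", cols)
for label, line in zip(rows, grid):
    print(label, "   ", line)
```

The grid grows with the product of the number of distinct row and column values, which gives a rough sense of why an overview becomes hard to obtain for larger data sets.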
Another useful visualization component is parallel coordinates [15, 19], which allow the display of multiple data dimensions on a 2D plane. Each dimension is drawn as one of several parallel axes placed side by side, and each data item is drawn as a line connecting its values on successive axes. For instance, when comparing cars, it would be meaningful to display horsepower, fuel consumption, car dimensions, acceleration, etc., on the same plot. The advantage of this technique is that it reveals correlations between data dimensions, e.g., horsepower and acceleration, and can thus depict the impact of one dimension on another. However, this applies only when the two dimensions are rendered adjacently; if another dimension intervenes between them, the visual cue disappears and the correlation becomes very difficult to discover. Additionally, parallel coordinates are unable to visualize non-numerical values and can be difficult for non-expert users to comprehend (Fig. 2).
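The role of axis order can be made concrete with a small sketch. It assumes a hypothetical car table and min-max scaling of every axis to [0, 1] (one common choice, not prescribed by the cited works); each record becomes a polyline of (axis position, scaled value) points, and only neighboring axes are directly connected by line segments.

```python
def normalize(column):
    """Min-max scale a numeric column to [0, 1]; constant columns map to 0.5."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) if hi > lo else 0.5 for v in column]


def polylines(table, axis_order):
    """One polyline per record: a list of (axis_index, scaled_value) points."""
    scaled = {axis: normalize(table[axis]) for axis in axis_order}
    n_records = len(table[axis_order[0]])
    return [
        [(x, scaled[axis][i]) for x, axis in enumerate(axis_order)]
        for i in range(n_records)
    ]


def adjacent_pairs(axis_order):
    """Pairs of dimensions whose relationship is directly visible."""
    return list(zip(axis_order, axis_order[1:]))


# Hypothetical car data, one list per dimension.
cars = {
    "horsepower": [70, 110, 150, 200],
    "consumption": [5.0, 6.5, 8.0, 10.0],
    "acceleration": [14.0, 11.0, 9.0, 6.5],
}

order = ["horsepower", "consumption", "acceleration"]
lines = polylines(cars, order)
print(lines[0])               # [(0, 0.0), (1, 0.0), (2, 1.0)]
print(adjacent_pairs(order))  # horsepower and acceleration are not adjacent
```

With this order, horsepower and acceleration are separated by consumption; reordering the axes to `["consumption", "horsepower", "acceleration"]` makes them neighbors, at the cost of separating another pair.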
In order to address the aforementioned drawbacks of tables and parallel coordinates, state-of-the-art commercial tools employ techniques such as pie charts, plots, graphs and trees. The leading software for Big Data visualization, business intelligence and analytics is Tableau [24], while other approaches include QlikView [29] and Microsoft’s Power BI [28], among others.
Tableau [24] provides default tools for rendering each specific data type, which can be overridden by the user on demand. Each visualization is selected to optimally represent a specific data type: for instance, cross-tabs are used for the visualization of discrete categorical data and lines for continuous quantitative information.
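The idea of a type-driven default that the user can override may be sketched as a simple lookup. This is a hypothetical illustration and not Tableau's API: the two rules are the ones mentioned above, while the function name, the type labels and the fallback value are assumptions.

```python
# Default view per (kind, role) of a data field; only the two rules
# mentioned in the text are encoded here.
DEFAULT_VIEW = {
    ("discrete", "categorical"): "cross-tab",
    ("continuous", "quantitative"): "line",
}


def choose_view(kind, role, override=None):
    """Return the user's override if given, else the default for the type."""
    if override is not None:
        return override
    # Assumption: fall back to a cross-tab for types without a rule.
    return DEFAULT_VIEW.get((kind, role), "cross-tab")


print(choose_view("continuous", "quantitative"))                  # line
print(choose_view("continuous", "quantitative", override="bar"))  # bar
```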
Even though OLAP tools are very efficient for Big Data retrieval and on-demand visualization, they usually lack exploratory functionality. Furthermore, due to the complex nature of the operations they supply, OLAP tools tend to be cumbersome for novice users.
3.2 Big Data 3D Visualization
Traditional visualization techniques fall short of displaying Big Data sets efficiently and intuitively; therefore, rich interactive visualization remains both a business and a research challenge.
In order to fill this gap, some approaches in the literature employ 3D visualization for OLAP [2, 20]. However, 3D visualization approaches are not yet very popular.
Adding a dimension to the visualization increases the efficiency with which the dimensions of Big Data can be depicted. Since the human brain is trained to sense and act in three dimensions, it perceives three dimensions in a natural manner. The third dimension therefore conveys an aspect of the rendered information that the user can easily perceive, which enhances the potential for exploration in a virtual world.
On the other hand, 3D visualizations have certain drawbacks. In general, they present a steep learning curve, as inexperienced users are often unable to orient themselves in the virtual three-dimensional space or to manipulate the interface. This difficulty in perceiving 3D space affects not only navigation in the virtual space itself, but also the exploration and identification of the visualized information. The interaction complexity of 3D user interfaces is also a significant burden, since the additional degrees of freedom require complex manipulation controls and a rich interaction vocabulary.
3.3 Virtual and Augmented Reality
Virtual and augmented reality environments form an emerging approach to Big Data visualization. In the context of Big Data, such environments constitute interdisciplinary efforts that combine the areas of 3D graphics, stereoscopic environments, computer vision and Big Data querying. Their main advantage is improved user experience and immersion, which allows better perception of the visualized geometry. Furthermore, in comparison to traditional 3D visualizations, users perceive themselves within the context of the visualization and thus orient themselves more easily.
Helbig et al. [17] use a virtual reality environment to visualize massive data in the context of the Weather Research and Forecasting (WRF) model. Another interesting approach is the immersive visualization of a landscape on Mars [11], augmented with data describing the surface characteristics.
Future challenges in applying virtual and augmented reality to Big Data visualization include multimodal interaction as well as display and equipment limitations [25]. Virtual and augmented reality is a growing research field, mainly due to the emergence of devices such as the Microsoft Kinect and Leap Motion, which provide more natural, gesture-based interaction, and the Oculus Rift, which has brought virtual reality back into play as a mainstream visualization technique. However, several challenges remain in incorporating and enhancing traditional 2D desktop approaches in virtual space, as well as in developing a suitable infrastructure on the Big Data side to support additional needs that may arise.
3.4 Graph Visualization
Graphs are a common technique for displaying the relationships between different entities. Their main advantage is that the user can start from a specific node and explore its neighboring nodes, which is especially useful when visualizing data sets that describe networks or relationships. The survey by Beck et al. [4] describes a trend towards combining graphs with interactive timelines in order to include potential temporal characteristics of the information.
Furthermore, graph visualizations simplify exploration by providing operations like sampling, filtering, partitioning and clustering [27], while they can also support several abstraction layers [6] in order to provide meaningful views according to the scope of the visualization, ranging from overview to detailed view.
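Two of these ideas, exploring outward from a chosen node and filtering the graph, can be sketched on a small adjacency-set representation. The graph, the hop limit and the degree threshold are hypothetical, and degree-based filtering is only one of many possible filter criteria.

```python
from collections import deque

# Hypothetical undirected graph as adjacency sets.
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a"},
    "d": {"b", "e"},
    "e": {"d"},
    "f": set(),
}


def neighborhood(graph, start, max_hops):
    """Breadth-first exploration: nodes within max_hops of start, with distances."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


def filter_by_degree(graph, min_degree):
    """Keep nodes with at least min_degree neighbors in the original graph."""
    keep = {node for node, nbrs in graph.items() if len(nbrs) >= min_degree}
    return {node: graph[node] & keep for node in keep}


print(neighborhood(GRAPH, "a", 1))   # "a" plus its direct neighbors
print(filter_by_degree(GRAPH, 2))    # drops the leaves and the isolated node
```

Raising `max_hops` step by step corresponds to moving from a detailed local view towards an overview of the graph.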
Even though graphs can be very helpful in illustrating specific aspects of Big Data, they tend to focus on only one aspect of the data, namely the interconnection between nodes. Moreover, graphs are meaningful only if the data they present are coherent, and they are not suited to illustrating other aspects, such as comparisons between data items or temporal relationships.
3.5 Exploratory Data Analysis
Exploratory Analytics, also known as Discovery Analytics, refers to the use of visual exploration techniques in the context of Big Data with the aim of discovering facts or characteristics of the data that users were previously unaware of [31, 34, 36]. Heer and Shneiderman [16] present a widely adopted taxonomy of interactive dynamics for visual analysis. The taxonomy groups tasks into three high-level categories: data and view specification, view manipulation, and analysis process and provenance.
Faceted navigation, also referred to as data and view specification [16], is the process of applying specific filters to the provided data sets in order to focus on the subset of interest. Faceted navigation can combine multiple visualization techniques, applying the most suitable one to each data type. One example is EDEN [33], where the authors use parallel coordinates and geographic visualizations to interactively refine the displayed values and thus offer exploratory analysis of Big Data by exploring relationships between entities. Another example of interactive faceted exploration is discussed in [36], which combines automatically generated and manually specified visualizations in order to improve support for data exploration.
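The filtering step of faceted navigation can be sketched as follows. The items, facet names and helper functions are hypothetical and are not taken from EDEN [33] or [36]; the sketch only shows how selected facet values narrow the data set and how per-facet counts can guide the next refinement.

```python
from collections import Counter


def apply_facets(items, selected):
    """Keep items whose value for every selected facet is among the chosen values."""
    return [
        item for item in items
        if all(item[facet] in values for facet, values in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many of the given items fall under each value of a facet."""
    return Counter(item[facet] for item in items)


# Hypothetical data set.
items = [
    {"name": "A", "fuel": "petrol", "body": "sedan"},
    {"name": "B", "fuel": "diesel", "body": "suv"},
    {"name": "C", "fuel": "petrol", "body": "suv"},
    {"name": "D", "fuel": "electric", "body": "sedan"},
]

subset = apply_facets(items, {"fuel": {"petrol", "electric"}, "body": {"sedan"}})
print([item["name"] for item in subset])  # ['A', 'D']
print(facet_counts(subset, "fuel"))
```

Values within one facet are combined with a logical OR and different facets with a logical AND, which is a common convention but an assumption of this sketch.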