rdf:SynopsViz – A Framework for Hierarchical Linked Data Visual Exploration and Analysis

Bikakis, Nikos; Skourla, Melina; Papastefanatos, George

doi:10.1007/978-3-319-11955-7_37

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8798)

Included in the following conference series:

European Semantic Web Conference

Abstract

The purpose of data visualization is to offer intuitive ways for information perception and manipulation, especially for non-expert users. The Web of Data has made a huge number of datasets available. However, the volume and heterogeneity of the available information make it difficult for humans to manually explore and analyse large datasets. In this paper, we present rdf:SynopsViz, a tool for hierarchical charting and visual exploration of Linked Open Data (LOD). Hierarchical LOD exploration is based on the creation of multiple levels of hierarchically related groups of resources based on the values of one or more properties. The adopted hierarchical model provides effective information abstraction and summarization. It also allows efficient, on-the-fly statistical computations, using aggregations over the hierarchy levels.

1 Introduction

The purpose of data visualization is to offer intuitive ways for information perception and manipulation that essentially amplify, especially for non-expert users, the overall cognitive performance of information processing. This is of great importance in the Web of Data, where the volume and heterogeneity of available information make it difficult for humans to manually explore and analyse large datasets. An important challenge is that visualization techniques must offer scalability and efficient processing for on-the-fly visualization of large datasets. They must also employ appropriate data abstractions and aggregations to avoid information overload caused by the size and diversity of the data presented to the user. Finally, they must be generic and provide uniform and intuitive visualization results across multiple domains.

In this work, we present rdf:SynopsViz, a framework for hierarchical charting and exploration of Linked Open Data (LOD). Hierarchical LOD exploration is realized through the creation of multiple levels of hierarchically related groups of resources based on the values of one or more properties. For example, a numerical group, characterized by a numerical range, comprises all resources with a property value within the range of this group. Hierarchical browsing can address the problem of information overload, as it provides information abstraction and summarization [1]. It can also offer rich insights into the underlying data when combined with detailed statistical information on the groups and their contents.
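As a minimal sketch of the numerical grouping described above, the following Python snippet assigns resources to groups by the range their property value falls into. The resource names, the property values, the range boundaries, and the half-open interval convention are all assumptions made for illustration; the paper does not specify how rdf:SynopsViz represents groups internally.

```python
from collections import defaultdict

def group_by_range(values, ranges):
    """Assign each (resource, value) pair to the first half-open
    range [lo, hi) that contains the value. Values that fall in
    no range are left out."""
    groups = defaultdict(list)
    for resource, value in values:
        for lo, hi in ranges:
            if lo <= value < hi:
                groups[(lo, hi)].append(resource)
                break
    return dict(groups)

# Hypothetical resources, each with one numeric property value.
populations = [("ex:A", 5), ("ex:B", 42), ("ex:C", 17), ("ex:D", 99)]
# Hypothetical group ranges, one numerical group per range.
ranges = [(0, 25), (25, 50), (50, 100)]

groups = group_by_range(populations, ranges)
for (lo, hi), members in sorted(groups.items()):
    print(f"[{lo}, {hi}): {members}")
```

Each key of `groups` plays the role of one numerical group, and its list holds the resources whose value lies within that group's range.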

The key features of the rdf:SynopsViz framework are summarized as follows: (1) It adopts a hierarchical model for RDF data visualization, browsing and analysis. (2) It offers automatic on-the-fly hierarchy construction based on data distribution, as well as user-defined hierarchy construction based on the user's preferences. (3) It provides faceted browsing and filtering over classes and properties. (4) It integrates statistics with visualization; visualizations have been enriched with useful statistics and data information. (5) It offers several visualization techniques (e.g., timeline, chart, treemap). (6) It provides a large number of dataset statistics at the data level (e.g., number of sameAs triples), the schema level (e.g., most common classes/properties), and the structure level (e.g., entities with the largest in-degree). (7) It provides numerous metadata related to the dataset: licensing, provenance, linking, availability, understandability, etc. The latter are useful for assessing data quality [13].

2 Framework Overview

The architecture of rdf:SynopsViz is presented in Fig. 1. Our scenario involves three main parts: the Client GUI, the rdf:SynopsViz framework, and the input data. The Client part corresponds to the framework's front-end, offering several functionalities to the end-users (e.g., statistical analysis, facet search, etc.). rdf:SynopsViz consumes RDF data as input data; optionally, OWL-RDF/S vocabularies/ontologies describing the input data can be loaded. Next, we describe the basic components of the rdf:SynopsViz framework.

In the preprocessing phase, the Data and Schema Handler parses the input data and infers schema information (e.g., property domain(s)/range(s), class/property hierarchy, type of instances, type of properties, etc.). The Facets Generator generates class and property facets over the input data. The Statistics Generator computes several statistics regarding the schema, instances and graph structure of the input dataset, such as the number of distinct classes and properties (schema), the number of sameAs triples (instances), and the average in/out degree of the RDF graph (graph structure). The Metadata Extractor collects dataset metadata which can be used for data quality assessment. The Hierarchical Model Module adopts our hierarchy model and stores the initial data enriched with the information computed during the preprocessing phase.
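To make the kind of statistics mentioned above concrete, here is a small, self-contained Python sketch that computes a few of them over a list of triples. It is not the Statistics Generator itself (the framework relies on Jena); the tuple representation, the function name `dataset_statistics`, the example triples, and the convention of averaging out-degree over distinct subjects and in-degree over distinct objects are assumptions chosen for illustration.

```python
from collections import Counter

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def dataset_statistics(triples):
    """Compute simple data-, schema- and structure-level statistics
    over an iterable of (subject, predicate, object) tuples."""
    triples = list(triples)
    out_degree = Counter(s for s, _, _ in triples)  # edges leaving each subject
    in_degree = Counter(o for _, _, o in triples)   # edges entering each object
    return {
        # data level
        "same_as_triples": sum(1 for _, p, _ in triples if p == OWL_SAME_AS),
        # schema level
        "distinct_classes": len({o for _, p, o in triples if p == RDF_TYPE}),
        "distinct_properties": len({p for _, p, _ in triples}),
        # structure level (averages over nodes that have at least one edge)
        "avg_out_degree": len(triples) / len(out_degree) if out_degree else 0.0,
        "avg_in_degree": len(triples) / len(in_degree) if in_degree else 0.0,
        "largest_in_degree": in_degree.most_common(1),
    }

# Hypothetical example dataset.
triples = [
    ("ex:a", RDF_TYPE, "ex:City"),
    ("ex:b", RDF_TYPE, "ex:City"),
    ("ex:a", OWL_SAME_AS, "ex:a2"),
    ("ex:b", "ex:near", "ex:a"),
]
stats = dataset_statistics(triples)
print(stats)
```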

During runtime, the following components are involved. The Hierarchy Specifier is responsible for managing the configuration parameters of our hierarchy model (e.g., the number of hierarchy levels and the number of nodes per level) and for providing this information to the Hierarchy Constructor. The Hierarchy Constructor implements the hierarchy model. Based on the selected facets and the hierarchy configuration, it determines the hierarchy of groups and the contained triples, and computes statistics about their contents (e.g., range, variance, mean, number of triples contained, etc.). The Visualization Module handles the interaction between the user and the framework, supporting several operations (e.g., navigation, filtering, hierarchy specification) over the visualized data.
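The following Python sketch illustrates one way a hierarchy of numerical groups could be built bottom-up, with each parent's statistics derived by aggregating its children rather than rescanning the raw values. It is an illustration under stated assumptions, not the authors' Hierarchy Constructor: the `Group` class, the `leaf_size` and `degree` parameters, and the choice of equal-size leaves over sorted values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Group:
    lo: float        # smallest value in the group
    hi: float        # largest value in the group
    count: int       # number of values in the group
    total: float     # sum of values
    total_sq: float  # sum of squared values
    children: List["Group"] = field(default_factory=list)

    @property
    def mean(self) -> float:
        return self.total / self.count

    @property
    def variance(self) -> float:
        # Population variance: E[x^2] - (E[x])^2
        return self.total_sq / self.count - self.mean ** 2

def leaf(values: List[float]) -> Group:
    """Build a leaf group directly from raw (sorted) values."""
    return Group(values[0], values[-1], len(values),
                 sum(values), sum(v * v for v in values))

def merge(children: List[Group]) -> Group:
    """Build a parent by aggregating the children's statistics only."""
    return Group(children[0].lo, children[-1].hi,
                 sum(c.count for c in children),
                 sum(c.total for c in children),
                 sum(c.total_sq for c in children),
                 list(children))

def build_hierarchy(values, leaf_size: int, degree: int) -> Group:
    """Sort the values, cut them into leaves of leaf_size values, then
    repeatedly merge `degree` adjacent groups until one root remains."""
    if not values or leaf_size < 1 or degree < 2:
        raise ValueError("need values, leaf_size >= 1 and degree >= 2")
    ordered = sorted(values)
    level = [leaf(ordered[i:i + leaf_size])
             for i in range(0, len(ordered), leaf_size)]
    while len(level) > 1:
        level = [merge(level[i:i + degree])
                 for i in range(0, len(level), degree)]
    return level[0]

# Hypothetical property values.
root = build_hierarchy([4, 8, 15, 16, 23, 42, 1, 2], leaf_size=2, degree=2)
print(root.lo, root.hi, root.count, root.mean, root.variance)
```

Because count, sum and sum of squares are additive, the mean and variance of any group follow from its children's aggregates, which is the property that makes on-the-fly statistics over hierarchy levels cheap. Changing `leaf_size` or `degree` yields a coarser or finer hierarchy over the same data.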

3 Implementation and Demonstration Outline

Implementation. rdf:SynopsViz is implemented on top of several open source tools and libraries. Regarding visualization libraries, we use Highcharts^{Footnote 1} for the area and timeline charts and Google Charts^{Footnote 2} for treemap and pie charts. Additionally, it uses the Jena framework^{Footnote 3} for RDF data handling and Jena TDB for RDF storage.

The web-based prototype of rdf:SynopsViz is available at http://synopsviz.imis.athena-innovation.gr. A video demonstrating the scenario presented below is also available at http://youtu.be/8v-He1U4oxs.

Demonstration scenario. First, the attendees will be able to select a dataset from a number of offered real-world datasets (e.g., DBpedia, Eurostat, World Bank, U.S. Census, etc.) or upload their own. Then, for the selected dataset, the attendees are able to examine several of the dataset's metadata and explore several of the dataset's statistics.

Using the facets panel, the attendees are able to navigate and filter data based on classes and on numeric and date properties. In addition, during facet navigation, information about the classes and properties (e.g., number of instances, domain(s), range(s), IRI, etc.) is provided to the users through the UI.

The attendees are able to navigate over data by considering the properties' values. In particular, area charts and timeline-area charts are used to visualize the resources according to the user's selected properties. Class facets can also be used to filter the visualized data. Initially, the top level of the hierarchy is presented, providing an overview of the data organized into top-level groups; the user can interactively zoom in and out of the group of interest, down to the actual values of the raw input data. At the same time, statistical information concerning the hierarchy groups as well as their contents (e.g., mean value, variance, sample data, etc.) is presented.

In addition, the attendees are able to navigate over the data through the class hierarchy. Selecting one or more classes, the attendees can interactively navigate over the class hierarchy using treemaps. In rdf:SynopsViz the treemap visualization has been enriched with schema and statistical information. For each class, schema metadata (e.g., number of instances, subclasses, datatype/object properties) and statistical information (e.g., the cardinality of each property, min and max values for datatype properties' ranges, etc.) are provided.
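A treemap over a class hierarchy typically sizes each rectangle by the number of instances under the corresponding class. The sketch below shows, under stated assumptions, how such sizes could be derived from subclass relations and per-class instance counts. The function name `treemap_sizes`, the example classes, and the counts are hypothetical, and the code assumes an acyclic subclass hierarchy in which each instance is counted under exactly one direct class.

```python
from collections import defaultdict

def treemap_sizes(subclass_of, direct_instances):
    """Return, for every class, the number of instances of the class
    itself plus those of all its (transitive) subclasses.

    subclass_of: iterable of (child_class, parent_class) pairs
    direct_instances: dict mapping class -> number of direct instances
    """
    pairs = list(subclass_of)
    children = defaultdict(list)
    for child, parent in pairs:
        children[parent].append(child)

    sizes = {}

    def total(cls):
        # Memoised post-order traversal of the subclass tree.
        if cls not in sizes:
            sizes[cls] = direct_instances.get(cls, 0) + sum(
                total(c) for c in children.get(cls, []))
        return sizes[cls]

    all_classes = set(direct_instances)
    for child, parent in pairs:
        all_classes.update((child, parent))
    for cls in all_classes:
        total(cls)
    return sizes

# Hypothetical class hierarchy and direct instance counts.
subclass_of = [("ex:City", "ex:Place"),
               ("ex:Village", "ex:Place"),
               ("ex:Capital", "ex:City")]
direct = {"ex:Place": 1, "ex:City": 10, "ex:Village": 5, "ex:Capital": 2}

sizes = treemap_sizes(subclass_of, direct)
print(sizes)
```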

Finally, the attendees can interactively modify the hierarchy specifications. In particular, they are able to increase or decrease the level of abstraction/detail presented by modifying both the number of hierarchy levels and the number of nodes per level.

4 Related Work

A large number of works studying issues related to RDF or LOD visualization and analysis have been proposed in the literature [2–5]. Additionally, numerous tools offering RDF or Linked Open Data visualization have been developed, e.g., Sgvizler [6], LODWheel [7], Payola [8], CubeViz [9], KC-Viz [10], RelFinder^{Footnote 4}, Welkin^{Footnote 5}, IsaViz^{Footnote 6}, RDF-Gravity^{Footnote 7}, etc.

In the context of RDF and Linked Open Data statistics, RDFStats [12] calculates statistical information about RDF datasets. LODStats [11] is an extensible framework offering scalable statistical analysis of Linked Open Data datasets.

Regarding quality assessment, [13] studies the criteria which can be used in Linked Data quality assessment. Reference [14] reviews millions of RDF documents to analyse Linked Data conformance. Finally, several frameworks for quality assessment in the Web of Data have been proposed, e.g., LINK-QA [15], Sieve [16], and WIQA [17]. In contrast to existing approaches, we provide hierarchical RDF data visualization enriched with data statistics. The hierarchical model addresses visualization overload and supports efficient, on-the-fly statistical computations over the hierarchy levels. Finally, thanks to the hierarchical model, our tool can efficiently handle and analyse very large datasets.

5 Conclusions

In this paper we have presented rdf:SynopsViz, a framework for hierarchical charting and exploration of Linked Open Data. The hierarchical model adopted by our framework can address the problem of information overload, offering an effective mechanism for information abstraction and summarization. Additionally, the adopted model allows efficient statistical computations, using aggregations over the hierarchy levels.

Future extensions of our tool include the application of more sophisticated filtering techniques (e.g., SPARQL-enabled browsing over the data), as well as the addition of more visualization techniques and libraries.

References

1. Elmqvist, N., Fekete, J.-D.: Hierarchical aggregation for information visualization: overview, techniques, and design guidelines. IEEE Trans. Vis. Comput. Graph. 16(3), 927–934 (2010)
2. Dadzie, A., Rowe, M.: Approaches to visualising Linked Data: a survey. Seman. Web 2(2), 89–124 (2011)
3. Brunetti, J., Auer, S., Garcia, R.: The linked data visualization model. In: ISWC 2012 (2012)
4. Dadzie, A.-S., Rowe, M., Petrelli, D.: Hide the Stack: toward usable linked data. In: Antoniou, G., Grobelnik, M., Simperl, E., Parsia, B., Plexousakis, D., De Leenheer, P., Pan, J. (eds.) ESWC 2011, Part I. LNCS, vol. 6643, pp. 93–107. Springer, Heidelberg (2011)
5. Alonen, M., Kauppinen, T., Suominen, O., Hyvönen, E.: Exploring the linked university data with visualization tools. In: Cimiano, P., Fernández, M., Lopez, V., Schlobach, S., Völker, J. (eds.) ESWC 2013. LNCS, vol. 7955, pp. 204–208. Springer, Heidelberg (2013)
6. Skjæveland, M.: Sgvizler: a JavaScript wrapper for easy visualization of SPARQL result sets. In: ESWC 2012 (2012)
7. Stuhr, M., Dumitru, R., Norheim, D.: LODWheel - JavaScript-based visualization of RDF data. In: Workshop on Consuming Linked Data 2011 (2011)
8. Klímek, J., Helmich, J., Nečaský, M.: Payola: collaborative linked data analysis and visualization framework. In: Cimiano, P., Fernández, M., Lopez, V., Schlobach, S., Völker, J. (eds.) ESWC 2013. LNCS, vol. 7955, pp. 147–151. Springer, Heidelberg (2013)
9. Salas, P., Mota, F., Breitman, K., Casanova, M., Martin, M., Auer, S.: Publishing statistical data on the web. In: IEEE Semantic Computing 2012 (2012)
10. Motta, E., Mulholland, P., Peroni, S., d’Aquin, M., Gomez-Perez, J.M., Mendez, V., Zablith, F.: A novel approach to visualizing and navigating ontologies. In: Aroyo, L., Welty, C., Alani, H., Taylor, J., Bernstein, A., Kagal, L., Noy, N., Blomqvist, E. (eds.) ISWC 2011, Part I. LNCS, vol. 7031, pp. 470–486. Springer, Heidelberg (2011)
11. Auer, S., Demter, J., Martin, M., Lehmann, J.: LODStats – an extensible framework for high-performance dataset analytics. In: Aussenac-Gilles, N., d’Acquin, M., Handschuh, S., Hernandez, N., Nikolov, A., Stuckenschmidt, H., Teije, A., Völker, J. (eds.) EKAW 2012. LNCS, vol. 7603, pp. 353–362. Springer, Heidelberg (2012)
12. Langegger, A., Wöß, W.: RDFStats - an Extensible RDF Statistics Generator and Library. In: Workshop on Web Semantics 2009 (2009)
13. Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., Auer, S.: Quality assessment methodologies for linked open data. Under review, available at Semantic Web Journal site
14. Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., Decker, S.: An empirical survey of Linked Data conformance. J. Web Sem. 14, 14–44 (2012)
15. Guéret, C., Groth, P., Stadler, C., Lehmann, J.: Assessing linked data mappings using network measures. In: Simperl, E., Cimiano, P., Polleres, A., Corcho, O., Presutti, V. (eds.) ESWC 2012. LNCS, vol. 7295, pp. 87–102. Springer, Heidelberg (2012)
16. Mendes, P., Mühleisen, H., Bizer, C.: Sieve: linked data quality assessment and fusion. In: Workshop on Linked Web Data Management 2012 (2012)
17. Bizer, C., Cyganiak, R.: Quality-driven information filtering using the WIQA policy framework. J. Web Sem. 7(1), 1–10 (2009)

Acknowledgement

This research has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALIS and KRIPIS - Investing in knowledge society through the European Social Fund.

Author information

Authors and Affiliations

National Technical University of Athens, Athens, Greece
Nikos Bikakis & Melina Skourla
IMIS, ATHENA Research Center, Athens, Greece
Nikos Bikakis & George Papastefanatos

Corresponding author

Correspondence to Nikos Bikakis.

Editor information

Editors and Affiliations

ISTC-CNR, Rome, Italy
Valentina Presutti
Linköping University, Linköping, Sweden
Eva Blomqvist
EURECOM, Biot, France
Raphael Troncy
Hasso-Plattner-Institut, Potsdam, Brandenburg, Germany
Harald Sack
Ionian University, Corfu, Greece
Ioannis Papadakis
Elsevier B.V., Amsterdam, The Netherlands
Anna Tordai

Cite this paper

Bikakis, N., Skourla, M., Papastefanatos, G. (2014). rdf:SynopsViz – A Framework for Hierarchical Linked Data Visual Exploration and Analysis. In: Presutti, V., Blomqvist, E., Troncy, R., Sack, H., Papadakis, I., Tordai, A. (eds) The Semantic Web: ESWC 2014 Satellite Events. ESWC 2014. Lecture Notes in Computer Science, vol 8798. Springer, Cham. https://doi.org/10.1007/978-3-319-11955-7_37

DOI: https://doi.org/10.1007/978-3-319-11955-7_37
Published: 16 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11954-0
Online ISBN: 978-3-319-11955-7
eBook Packages: Computer Science, Computer Science (R0)
