Storing the Semantic Web: Repositories

Abstract

Semantic repositories are database management systems capable of handling structured data while taking its semantics into consideration. The Semantic Web represents the next-generation Web of Data, where information is published and interlinked in a way that enables both humans and machines to exploit its structure and meaning. To foster the realization of the Semantic Web, the World Wide Web Consortium (W3C) developed a series of metadata, ontology, and query languages for it. Following the enthusiasm about the Semantic Web and the wide adoption of the related standards, most semantic repositories today are database engines that handle data represented in RDF, support SPARQL queries, and can interpret schemas and ontologies represented in RDFS and OWL. Naturally, such engines take on the role of the Web servers of the Semantic Web.

This chapter starts with an introduction to semantic repositories and a discussion of their links to several other technology trends, including relational databases, column-stores, and expert systems. Because reasoning is the most distinguishing quality of semantic repositories, an overview of the strategies for integrating inference into the data management life cycle is presented. An overall view of the mechanics of the engines is provided from the perspective of a conceptual framework that reveals all their tasks and activities (e.g., storage and retrieval) along with the factors that impact their performance (e.g., data size and complexity). A review of several design issues, including distribution, serves as a basis for understanding the different implementation approaches and their implications for the performance of semantic repositories. Several of the most popular benchmarks and datasets, which are often used as yardsticks for the performance of the engines, and a few of the outstanding semantic repositories are presented along with the best published evaluation results.

The advantages and typical applications of semantic repositories are presented with a focus on two usage scenarios: reasoning with and managing linked data (a popular trend in the Semantic Web), and enterprise data integration. The chapter ends with some considerations regarding the future development of semantic repositories and design topics such as adaptive indexing and interoperability patterns.