Abstract
To cope with the expected size of the Semantic Web (SW) in the coming years, we need to benchmark existing SW tools (e.g., query language interpreters) in a credible manner. In this paper we present the first RDFS schema generator, termed PoweRGen, which takes into account the morphological features that schemas frequently exhibit in reality. In particular, we are interested in synthetically generating the two core components of an RDFS schema, namely the property graph (relationships between classes or attributes) and the subsumption graph (subsumption relationships among classes). The total-degree distribution of the former, as well as the out-degree distribution of the Transitive Closure (TC) of the latter, usually follows a power law. PoweRGen produces synthetic property and subsumption graphs whose distributions respect the power-law exponents given as input with a confidence ranging from 90% to 98%.
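To make the idea of a degree distribution that respects a given power-law exponent more concrete, the following minimal Python sketch draws a degree sequence with P(d) proportional to d^(-exponent) and checks whether a simple undirected graph could realize it. It is an illustration under stated assumptions, not PoweRGen's algorithm: the function names, the `max_degree` cutoff, and the use of the Erdős–Gallai test on an undirected simplification of the property graph are all choices made here for illustration.

```python
import random


def sample_power_law_degrees(n, exponent, max_degree, seed=0):
    """Draw n degrees with P(d) proportional to d**(-exponent), 1 <= d <= max_degree.

    Hypothetical helper for illustration; resamples until the degree sum is
    even, since every edge contributes 2 to the total degree of a graph.
    """
    rng = random.Random(seed)
    support = list(range(1, max_degree + 1))
    weights = [d ** (-exponent) for d in support]
    while True:
        degrees = rng.choices(support, weights=weights, k=n)
        if sum(degrees) % 2 == 0:
            return degrees


def is_graphical(degrees):
    """Erdos-Gallai test: True if some simple undirected graph has these degrees."""
    seq = sorted(degrees, reverse=True)
    if sum(seq) % 2 == 1:
        return False
    for k in range(1, len(seq) + 1):
        left = sum(seq[:k])
        right = k * (k - 1) + sum(min(d, k) for d in seq[k:])
        if left > right:
            return False
    return True


if __name__ == "__main__":
    degrees = sample_power_law_degrees(n=200, exponent=2.0, max_degree=30)
    print("largest degrees:", sorted(degrees, reverse=True)[:5])
    print("realizable as a simple graph:", is_graphical(degrees))
```

A generator built along these lines would still need a step that actually wires up edges for the sampled sequence, and a separate treatment of the subsumption graph, whose power law concerns the out-degrees of its transitive closure rather than of the graph itself.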
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Theoharis, Y., Georgakopoulos, G., Christophides, V. (2008). On the Synthetic Generation of Semantic Web Schemas. In: Christophides, V., Collard, M., Gutierrez, C. (eds) Semantic Web, Ontologies and Databases. ODBIS SWDB 2007 2007. Lecture Notes in Computer Science, vol 5005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70960-2_6
DOI: https://doi.org/10.1007/978-3-540-70960-2_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70959-6
Online ISBN: 978-3-540-70960-2
eBook Packages: Computer Science, Computer Science (R0)