Database Foundations for Scalable RDF Processing

  • Katja Hose
  • Ralf Schenkel
  • Martin Theobald
  • Gerhard Weikum
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6848)

Abstract

As more and more data is provided in RDF format, storing huge amounts of RDF data and efficiently processing queries on such data are becoming increasingly important. The first part of the lecture introduces state-of-the-art techniques for scalably storing and querying RDF with relational systems, including alternatives for storing RDF, efficient index structures, and query optimization techniques. Because centralized RDF repositories have limitations in scalability and failure tolerance, decentralized architectures have been proposed. The second part of the lecture highlights system architectures and strategies for distributed RDF processing. We cover search engines as well as federated query processing, highlight differences from classic federated database systems, and discuss efficient techniques for distributed query processing in general and for RDF data in particular. Finally, we argue that extracting knowledge from the Web is an excellent showcase, and potentially one of the biggest challenges seen so far, for the scalable management of uncertain data. The third part of the lecture thus provides a close-up on current approaches and platforms that make reasoning (e.g., in the form of probabilistic inference) with uncertain RDF data scalable to billions of triples.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Katja Hose (1)
  • Ralf Schenkel (1, 2)
  • Martin Theobald (1)
  • Gerhard Weikum (1)
  1. Max-Planck-Institut für Informatik, Saarbrücken, Germany
  2. Saarland University, Saarbrücken, Germany
