Abstract
An increasing amount of data is becoming available in the form of large triple stores, with the Semantic Web’s linked open data cloud (LOD) as one of the most prominent examples. Data quality and completeness are key issues in many community-generated data stores, like LOD, which motivates probabilistic and statistical approaches to data representation, reasoning and querying. In this paper we address the issue from the perspective of probabilistic databases, which account for uncertainty in the data via a probability distribution over all database instances. We obtain a highly compressed representation using the recently developed RESCAL approach and demonstrate experimentally that efficient querying can be obtained by exploiting inherent features of RESCAL via sub-query approximations of deterministic views.
Chapter PDF
References
Bordes, A., Weston, J., Collobert, R., Bengio, Y.: Learning structured embeddings of knowledge bases. In: AAAI (2011)
Boulos, J., Dalvi, N.N., Mandhani, B., Mathur, S., Ré, C., Suciu, D.: Mystiq: a system for finding more answers by using probabilities. In: SIGMOD Conference, pp. 891–893 (2005)
Calì, A., Lukasiewicz, T., Predoiu, L., Stuckenschmidt, H.: Tightly integrated probabilistic description logic programs for representing ontology mappings. In: Hartmann, S., Kern-Isberner, G. (eds.) FoIKS 2008. LNCS, vol. 4932, pp. 178–198. Springer, Heidelberg (2008)
da Costa, P.C.G., Laskey, K.B., Laskey, K.J.: Pr-owl: A bayesian ontology language for the semantic web. In: da Costa, P.C.G., d’Amato, C., Fanizzi, N., Laskey, K.B., Laskey, K.J., Lukasiewicz, T., Nickles, M., Pool, M. (eds.) URSW 2005 - 2007. LNCS (LNAI), vol. 5327, pp. 88–107. Springer, Heidelberg (2008)
Dalvi, N.N., Re, C., Suciu, D.: Queries and materialized views on probabilistic databases. J. Comput. Syst. Sci. 77(3), 473–490 (2011)
Ding, Z., Peng, Y., Pan, R.: A bayesian approach to uncertainty modelling in owl ontology. In: Proceedings of the International Conference on Advances in Intelligent Systems - Theory and Applications (2004)
Dylla, M., Miliaraki, I., Theobald, M.: Top-k query processing in probabilistic databases with non-materialized views. Research Report MPI-I-2012-5-002, Max-Planck-Institut für Informatik, Stuhlsatzenhausweg 85, 66123 Saarbrücken, Germany (June 2012)
Franz, T., Schultz, A., Sizov, S., Staab, S.: Triplerank: Ranking semantic web data by tensor decomposition. In: Bernstein, A., Karger, D.R., Heath, T., Feigenbaum, L., Maynard, D., Motta, E., Thirunarayan, K. (eds.) ISWC 2009. LNCS, vol. 5823, pp. 213–228. Springer, Heidelberg (2009)
Giugno, R., Lukasiewicz, T.: P-\(\mathcal{SHOQ}({\bf D})\): A probabilistic extension of \(\mathcal{SHOQ}({\bf D})\) for probabilistic ontologies in the semantic web. In: Flesca, S., Greco, S., Leone, N., Ianni, G. (eds.) JELIA 2002. LNCS (LNAI), vol. 2424, pp. 86–97. Springer, Heidelberg (2002)
Huang, J., Antova, L., Koch, C., Olteanu, D.: Maybms: a probabilistic database management system. In: SIGMOD Conference (2009)
Jenatton, R., Roux, N.L., Bordes, A., Obozinski, G.: A latent factor model for highly multi-relational data. In: NIPS (2012)
Kolda, T.G., Bader, B.W., Kenny, J.P.: Higher-order web link analysis using multilinear algebra. In: ICDM, pp. 242–249 (2005)
Laub, A.J.: Matrix analysis - for scientists and engineers. SIAM (2005)
Lehmann, J., Isele, R., Jakob, M., Jentzsch, A., Kontokostas, D., Mendes, P.N., Hellmann, S., Morsey, M., van Kleef, P., Auer, S., Bizer, C.: DBpedia - a large-scale, multilingual knowledge base extracted from wikipedia. Semantic Web Journal (2014)
Lukasiewicz, T.: Expressive probabilistic description logics. Artif. Intell. 172(6-7), 852–883 (2008)
Mutsuzaki, M., Theobald, M., de Keijzer, A., Widom, J., Agrawal, P., Benjelloun, O., Sarma, A.D., Murthy, R., Sugihara, T.: Trio-one: Layering uncertainty and lineage on a conventional dbms (demo). In: CIDR, pp. 269–274 (2007)
Nickel, M.: Tensor factorization for relational learning. PhDThesis, p. 48, 49, 74, Ludwig-Maximilian-University of Munich (August 2013)
Nickel, M., Tresp, V.: Logistic tensor factorization for multi-relational data. In: Structured Learning: Inferring Graphs from Structured and Unstructured Inputs, ICML WS (2013)
Nickel, M., Tresp, V., Kriegel, H.-P.: A three-way model for collective learning on multi-relational data. In: ICML, pp. 809–816 (2011)
Nickel, M., Tresp, V., Kriegel, H.-P.: Factorizing yago: scalable machine learning for linked data. In: WWW, pp. 271–280 (2012)
Olteanu, D., Wen, H.: Ranking query answers in probabilistic databases: Complexity and efficient algorithms. In: ICDE, pp. 282–293 (2012)
Platt, J.C.: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: Advances in Large Margin Classifiers, pp. 61–74. MIT Press (1999)
Rendle, S., Marinho, L.B., Nanopoulos, A., Schmidt-Thieme, L.: Learning optimal ranking with tensor factorization for tag recommendation. In: KDD, pp. 727–736 (2009)
Riedel, S., Yao, L., McCallum, A., Marlin, B.M.: Relation extraction with matrix factorization and universal schemas. In: HLT-NAACL, pp. 74–84 (2013)
Christopher, R., Dalvi, N., Suciu, D.: Efficient top-k query evaluation on probabilistic data. In: ICDE, pp. 886–895 (2007)
Singh, S., Mayfield, C., Mittal, S., Prabhakar, S., Hambrusch, S.E., Shah, R.: Orion 2.0: native support for uncertain data. In: SIGMOD Conference, pp. 1239–1242 (2008)
Suciu, D., Olteanu, D., Ré, C., Koch, C.: Probabilistic Databases. Synthesis Lectures on Data Management. Morgan & Claypool Publishers (2011)
Theobald, M., De Raedt, L., Dylla, M., Kimmig, A., Miliaraki, I.: 10 years of probabilistic querying - what next? In: Catania, B., Guerrini, G., Pokorný, J. (eds.) ADBIS 2013. LNCS, vol. 8133, pp. 1–13. Springer, Heidelberg (2013)
Tresp, V., Huang, Y., Bundschus, M., Rettinger, A.: Materializing and querying learned knowledge. In: First ESWC Workshop on Inductive Reasoning and Machine Learning on the Semantic Web (IRMLeS 2009) (2009)
Wermser, H., Rettinger, A., Tresp, V.: Modeling and learning context-aware recommendation scenarios using tensor decomposition. In: ASONAM, pp. 137–144 (2011)
Yang, Y., Calmet, J.: Ontobayes: An ontology-driven uncertainty model. In: CIMCA/IAWTIC, pp. 457–463 (2005)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Krompaß, D., Nickel, M., Tresp, V. (2014). Querying Factorized Probabilistic Triple Databases. In: Mika, P., et al. The Semantic Web – ISWC 2014. ISWC 2014. Lecture Notes in Computer Science, vol 8797. Springer, Cham. https://doi.org/10.1007/978-3-319-11915-1_8
Download citation
DOI: https://doi.org/10.1007/978-3-319-11915-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11914-4
Online ISBN: 978-3-319-11915-1
eBook Packages: Computer ScienceComputer Science (R0)