Querying Factorized Probabilistic Triple Databases

Krompaß, Denis; Nickel, Maximilian; Tresp, Volker

doi:10.1007/978-3-319-11915-1_8

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8797)

Abstract

An increasing amount of data is becoming available in the form of large triple stores, with the Semantic Web’s linked open data cloud (LOD) as one of the most prominent examples. Data quality and completeness are key issues in many community-generated data stores, like LOD, which motivates probabilistic and statistical approaches to data representation, reasoning and querying. In this paper we address the issue from the perspective of probabilistic databases, which account for uncertainty in the data via a probability distribution over all database instances. We obtain a highly compressed representation using the recently developed RESCAL approach and demonstrate experimentally that efficient querying can be obtained by exploiting inherent features of RESCAL via sub-query approximations of deterministic views.
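
The compression idea in the abstract can be illustrated with a minimal sketch. RESCAL, as described in the literature, approximates each relation's adjacency matrix X_k by A R_k Aᵀ, where A holds one latent vector per entity and R_k is a small matrix per relation, so a triple's score is a bilinear product and only n·r + m·r² numbers need storing instead of n²·m. Everything else here is an assumption made for illustration and is not the authors' query procedure: the toy factor values, the helper names, the sigmoid used to turn a score into a probability, and the tuple-independence rule used for the existential query.

```python
import math

def rescal_score(A, R_p, s, o):
    """Bilinear score a_s^T R_p a_o for the triple (s, p, o)."""
    a_s, a_o = A[s], A[o]
    return sum(a_s[i] * R_p[i][j] * a_o[j]
               for i in range(len(a_s)) for j in range(len(a_o)))

def to_probability(score, slope=1.0, offset=0.0):
    """Hypothetical calibration: squash a raw score into (0, 1) with a sigmoid."""
    return 1.0 / (1.0 + math.exp(-(slope * score + offset)))

def prob_exists(A, R_p, s, candidates):
    """P(at least one triple (s, p, o) holds), assuming independent triples."""
    none_true = 1.0
    for o in candidates:
        none_true *= 1.0 - to_probability(rescal_score(A, R_p, s, o))
    return 1.0 - none_true

def storage_cells(n_entities, n_relations, rank):
    """Cell counts for the dense tensor versus the factorized form."""
    dense = n_entities * n_entities * n_relations
    factorized = n_entities * rank + n_relations * rank * rank
    return dense, factorized

# Toy factors: 3 entities, rank 2, one relation (values are made up).
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
R_likes = [[0.0, 2.0], [-2.0, 0.0]]

score_01 = rescal_score(A, R_likes, 0, 1)     # subject 0, object 1
score_10 = rescal_score(A, R_likes, 1, 0)     # reversed direction
p_any = prob_exists(A, R_likes, 0, [1, 2])    # "does 0 relate to 1 or 2?"
dense, factorized = storage_cells(1000, 10, 20)
```

Because R_k need not be symmetric, the same pair of entities can score differently in the two directions, which is what lets the factorization represent directed relations.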

Author information

Authors and Affiliations

Ludwig Maximilian University, 80538, Munich, Germany
Denis Krompaß & Volker Tresp
Massachusetts Institute of Technology, Cambridge, MA, USA
Maximilian Nickel
Istituto Italiano di Tecnologia, Genova, Italy
Maximilian Nickel
Siemens AG, Corporate Technology, Munich, Germany
Volker Tresp

Editor information

Editors and Affiliations

Yahoo Labs, Diagonal 177, 08018, Barcelona, Spain
Peter Mika
Stanford University, 1265 Welch Road, 94305, Stanford, CA, USA
Tania Tudorache
DDIS, University of Zurich, Zurich, Switzerland
Abraham Bernstein
IBM Research, Yorktown Heights, NY, USA
Chris Welty
Information Sciences Institute and Department of Computer Science, University of Southern California, Los Angeles, CA, USA
Craig Knoblock
Google, USA
Denny Vrandečić & Natasha Noy
VU University Amsterdam, The Netherlands
Paul Groth
Department of Geography, University of California, Santa Barbara, CA, USA
Krzysztof Janowicz
School of Computer Science, The University of Manchester, Manchester, UK
Carole Goble

About this paper

Cite this paper

Krompaß, D., Nickel, M., Tresp, V. (2014). Querying Factorized Probabilistic Triple Databases. In: Mika, P., et al. The Semantic Web – ISWC 2014. ISWC 2014. Lecture Notes in Computer Science, vol 8797. Springer, Cham. https://doi.org/10.1007/978-3-319-11915-1_8

DOI: https://doi.org/10.1007/978-3-319-11915-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11914-4
Online ISBN: 978-3-319-11915-1
eBook Packages: Computer Science, Computer Science (R0)
