Ontology-Based Access to Probabilistic Data with OWL QL

  • Jean Christoph Jung
  • Carsten Lutz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7649)

Abstract

We propose a framework for querying probabilistic instance data in the presence of an OWL 2 QL ontology, arguing that the interplay of probabilities and ontologies is fruitful in many applications, such as managing data extracted from the web. The prime inference problem is computing answer probabilities, which can be implemented using standard probabilistic database systems. We establish a PTime vs. #P dichotomy for the data complexity of this problem by lifting a corresponding result from probabilistic databases. We also demonstrate that query rewriting (backwards chaining) is an important tool for our framework, show that non-existence of a rewriting into first-order logic implies #P-hardness, and briefly discuss approximation of answer probabilities.
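To make the idea of computing answer probabilities via query rewriting concrete, the following is a minimal sketch and not the authors' implementation. It assumes tuple-independent probabilistic facts and an ontology restricted to atomic concept inclusions, which is only a tiny fragment of OWL 2 QL. All names (`FACTS`, `TBOX`, `rewrite`, `saturate`, and the example concepts) are hypothetical. The sketch rewrites an atomic query backwards through the inclusions and evaluates the result in closed form, then cross-checks it by enumerating possible worlds and chasing each one forwards.

```python
from itertools import product

# Hypothetical tuple-independent facts: (concept, individual) -> probability.
FACTS = {
    ("Professor", "ann"): 0.5,
    ("Lecturer", "ann"): 0.4,
    ("Student", "bob"): 0.9,
}

# Hypothetical atomic concept inclusions, written as (sub, super).
TBOX = {("Professor", "Teacher"), ("Lecturer", "Teacher")}


def rewrite(concept, tbox):
    """Backward chaining: all atomic concepts that are subsumed by `concept`."""
    result, frontier = {concept}, [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in tbox:
            if sup == current and sub not in result:
                result.add(sub)
                frontier.append(sub)
    return result


def prob_by_rewriting(concept, individual, facts, tbox):
    """Probability that concept(individual) is entailed, using the rewriting.

    Under tuple independence, the disjunction of the rewritten atoms fails
    only if every one of them is absent.
    """
    miss = 1.0
    for c in rewrite(concept, tbox):
        miss *= 1.0 - facts.get((c, individual), 0.0)
    return 1.0 - miss


def saturate(world, tbox):
    """Forward chaining: close a set of facts under the concept inclusions."""
    closed = set(world)
    changed = True
    while changed:
        changed = False
        for c, ind in list(closed):
            for sub, sup in tbox:
                if sub == c and (sup, ind) not in closed:
                    closed.add((sup, ind))
                    changed = True
    return closed


def prob_by_worlds(concept, individual, facts, tbox):
    """Reference semantics: sum the weights of all worlds entailing the answer."""
    items = list(facts.items())
    total = 0.0
    for choice in product([True, False], repeat=len(items)):
        weight, world = 1.0, set()
        for (fact, p), present in zip(items, choice):
            weight *= p if present else 1.0 - p
            if present:
                world.add(fact)
        if (concept, individual) in saturate(world, tbox):
            total += weight
    return total
```

With these example facts, `prob_by_rewriting("Teacher", "ann", FACTS, TBOX)` evaluates 1 - (1 - 0.5)(1 - 0.4), which is approximately 0.7, and `prob_by_worlds` agrees up to floating-point error. The enumeration is exponential in the number of facts and serves only as a check. The closed form is valid here because the rewritten atoms are distinct independent facts; for general unions of conjunctive queries, no such simple formula need exist.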

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Jean Christoph Jung (1)
  • Carsten Lutz (1)
  1. Universität Bremen, Germany
