Preference-Based Query Answering in Probabilistic Datalog+/– Ontologies

Abstract

The incorporation of preferences into information systems, such as databases, has recently seen a surge in interest, mainly fueled by the revolution in Web data availability. Since the explosion in popularity of social media, modeling the preferences of Web users has also become increasingly appealing to many companies. A parallel surge in interest concerns modeling uncertainty in these domains, since uncertainty can arise from many uncontrollable factors. In this paper, we propose an extension of the Datalog+/– family of ontology languages with two models: one representing user preferences and one representing the (probabilistic) uncertainty with which inferences are made. Assuming that more probable answers are in general preferable, the question is how to rank answers to a user’s queries, since the preference model may conflict with the preferences induced by the probabilistic model; the need thus arises for preference combination operators. We propose four specific operators and study their semantic and computational properties. We also provide an algorithm for ranking answers based on iteratively computing the well-known skyline answers to a query and show that, under certain conditions, it runs in polynomial time in the data complexity. Furthermore, we report on an implementation and experimental results.
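The idea of ranking by iterating skyline answers can be illustrated with a minimal sketch. It is not the algorithm from the paper: it assumes the answers are already computed, that preferences are given as a strict (acyclic) binary relation, and it ignores the probabilistic model and the combination operators entirely. The names `skyline`, `rank_by_iterated_skyline`, and `prefers` are hypothetical.

```python
from typing import Callable, Hashable, Iterable, List


def skyline(items: Iterable[Hashable],
            prefers: Callable[[Hashable, Hashable], bool]) -> List[Hashable]:
    """Return the items that no other item in the pool is preferred to."""
    pool = list(items)
    return [a for a in pool
            if not any(prefers(b, a) for b in pool if b != a)]


def rank_by_iterated_skyline(answers: Iterable[Hashable],
                             prefers: Callable[[Hashable, Hashable], bool]
                             ) -> List[List[Hashable]]:
    """Peel off skyline layers until no answers remain.

    Layer 0 holds the undominated answers, layer 1 the answers that become
    undominated once layer 0 is removed, and so on.
    Assumption: `prefers` is acyclic; a cycle leaves an empty skyline.
    """
    remaining = set(answers)
    layers: List[List[Hashable]] = []
    while remaining:
        layer = skyline(remaining, prefers)
        if not layer:
            raise ValueError("preference relation contains a cycle")
        # Sorting only makes the output deterministic; it assumes the
        # answers are mutually comparable (for example, all strings).
        layers.append(sorted(layer))
        remaining.difference_update(layer)
    return layers


# Hypothetical preference edges: (x, y) means x is preferred to y.
edges = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
layers = rank_by_iterated_skyline("abcd", lambda x, y: (x, y) in edges)
print(layers)
```

Each pass costs a quadratic number of `prefers` checks in this naive form, and there are at most as many passes as answers, so the sketch stays polynomial in the number of answers; how that relates to the data complexity result stated above depends on conditions the abstract does not spell out.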

Notes

  1. http://www.imdb.com.

  2. http://jgrapht.org/.

References

  1. Atallah MJ, Qi Y (2009) Computing all skyline probabilities for uncertain data. In: Proceedings of PODS. ACM Press, New York, pp 279–287

  2. Beeri C, Vardi MY (1987) The implication problem for data dependencies. In: Proceedings of ICALP. Springer, Berlin, pp 73–85

  3. Berners-Lee T, Hendler J, Lassila O (2001) The semantic web. Sci Am 284(5):34–43

  4. Börzsönyi S, Kossmann D, Stocker K (2001) The skyline operator. In: Proceedings of ICDE. IEEE Computer Society, Los Alamitos, pp 421–430

  5. Calì A, Gottlob G, Kifer M (2008) Taming the infinite chase: query answering under expressive relational constraints. In: Proceedings of KR. AAAI Press, Menlo Park, pp 70–80

  6. Calì A, Gottlob G, Lukasiewicz T (2012) A general Datalog-based framework for tractable query answering over ontologies. J Web Sem 14:57–83

  7. Chomicki J (2003) Preference formulas in relational queries. ACM Trans Database Syst 28(4):427–466

  8. Chomicki J (2007) Database querying under changing preferences. Ann Math Artif Intell 50(1/2):79–109

  9. Domingos P, Webb WA (2012) A tractable first-order probabilistic logic. In: Proceedings of AAAI. AAAI Press, Menlo Park, pp 1902–1909

  10. Finger M, Wassermann R, Cozman FG (2011) Satisfiability in \(\mathcal{EL}\) with sets of probabilistic ABoxes. In: Proceedings of DL

  11. Gaertner W (2009) A primer in social choice theory: revised edition. Oxford University Press, Oxford

  12. Gottlob G, Lukasiewicz T, Martinez MV, Simari GI (2013) Query answering under probabilistic uncertainty in Datalog+/- ontologies. Ann Math Artif Intell 69(1):37–72

  13. Gottlob G, Orsi G, Pieris A (2011) Ontological queries: Rewriting and optimization. In: Proceedings of ICDE. IEEE Computer Society, Washington, DC, pp 2–13

  14. Govindarajan K, Jayaraman B, Mantha S (1995) Preference logic programming. In: Proceedings of ICLP. MIT Press, Cambridge, pp 731–745

  15. Govindarajan K, Jayaraman B, Mantha S (2001) Preference queries in deductive databases. New Generat Comput 19(1):57–86

  16. Hansson SO (1995) Changes in preference. Theory Decis 38:1–28

  17. Jung JC, Lutz C (2012) Ontology-based access to probabilistic data with OWL QL. In: Proceedings of ISWC. Springer, Berlin, pp 182–197

  18. Kim JH, Pearl J (1983) A computational model for causal and diagnostic reasoning in inference systems. In: Proceedings of IJCAI. William Kaufmann, Karlsruhe, pp 190–193

  19. Lacroix M, Lavency P (1987) Preferences: putting more knowledge into queries. In: Proceedings of VLDB. Morgan Kaufmann, Burlington, pp 1–4

  20. Lin X, Zhang Y, Zhang W, Cheema MA (2011) Stochastic skyline operator. In: Proceedings of ICDE. IEEE Computer Society, pp 721–732

  21. Lukasiewicz T, Martinez MV, Orsi G, Simari GI (2012) Heuristic ranking in tightly coupled probabilistic description logics. In: Proceedings of UAI. AUAI, Edinburgh, pp 554–563

  22. Lukasiewicz T, Martinez MV, Simari GI (2012) Consistent answers in probabilistic Datalog+/- ontologies. In: Proceedings of RR. Springer, Berlin, pp 156–171

  23. Lukasiewicz T, Martinez MV, Simari GI (2013) Preference-based query answering in Datalog+/- ontologies. In: Proceedings of IJCAI. AAAI Press / IJCAI, Menlo Park, pp 1017–1023

  24. Lukasiewicz T, Martinez MV, Simari GI (2013) Preference-based query answering in probabilistic Datalog+/- ontologies. In: Proceedings of ODBASE. Springer, Berlin, pp 501–518

  25. Noessner J, Niepert M (2011) ELOG: A probabilistic reasoner for OWL EL. In: Proceedings of RR. Springer, Berlin, pp 281–286

  26. Pei J, Jiang B, Lin X, Yuan Y (2007) Probabilistic skylines on uncertain data. In: Proceedings of VLDB. ACM Press, New York, pp 15–26

  27. Pini MS, Rossi F, Venable KB, Walsh T (2009) Aggregating partially ordered preferences. J Log Comput 19(3):475–502

  28. Richardson M, Domingos P (2006) Markov logic networks. Mach Learn 62(1/2):107–136

  29. Soliman MA, Ilyas IF, Chen-Chuan Chang K (2007) Top-k query processing in uncertain databases. In: Proceedings of ICDE. IEEE Computer Society, pp 896–905

  30. Stefanidis K, Koutrika G, Pitoura E (2011) A survey on representation, composition and application of preferences in database systems. ACM Trans Database Syst 36(3):19:1–19:45

  31. Warren HS Jr (1975) A modification of Warshall’s algorithm for the transitive closure of binary relations. Commun ACM 18(4):218–220

  32. Warshall S (1962) A theorem on Boolean matrices. J ACM 9(1):11–12

  33. Zhang X (2010) Probabilities and sets in preference querying. Ph.D. thesis, University at Buffalo, State University of New York

  34. Zhang X, Chomicki J (2009) Semantics and evaluation of top-k queries in probabilistic databases. Distrib Parallel Dat 26:67–126

Acknowledgments

This work was supported by the EPSRC grant EP/J008346/1 “PrOQAW: Probabilistic Ontological Query Answering on the Web”, by the European Research Council (FP7/2007–2013/ERC) grant 246858 “DIADEM”, by a Google European Doctoral Fellowship, and by a Yahoo! Research Fellowship. We are grateful to the reviewers of this paper and of its ODBASE-2013 preliminary version [24] for their useful feedback, as well as to Giorgio Orsi for his help with the Datalog+/– query answering engine.

Author information

Corresponding author

Correspondence to Thomas Lukasiewicz.

Cite this article

Lukasiewicz, T., Martinez, M.V., Simari, G.I. et al. Preference-Based Query Answering in Probabilistic Datalog+/– Ontologies. J Data Semant 4, 81–101 (2015). https://doi.org/10.1007/s13740-014-0040-x

Keywords

  • Datalog
  • Skyline Answers
  • Tuple-generating Dependencies (TGDs)
  • Boolean CQ (BCQ)
  • IMDB Dataset