Encyclopedia of Database Systems

2018 Edition
| Editors: Ling Liu, M. Tamer Özsu

Probabilistic Databases

  • Dan Suciu
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_275

Synonyms

Probabilistic databases extend standard databases with probabilities in order to model uncertainty in the data; query evaluation becomes probabilistic inference.

Definition

A probabilistic database is a database in which every tuple t belongs to the database with some probability P(t). When P(t) = 1, the tuple is certain to belong to the database; when 0 < P(t) < 1, it belongs to the database only with some probability; when P(t) = 0, the tuple is certain not to belong to the database and is usually not represented at all. A traditional (deterministic) database corresponds to the case where P(t) = 1 for all tuples t. Tuples with P(t) > 0 are called possible tuples. In addition to giving the probability of every tuple, a probabilistic database must also specify how the tuples are correlated. In the simplest cases, two tuples t1 and t2 are declared to be either independent, meaning P(t1t2) = P(t1) · P(t2), or exclusive (also called disjoint), meaning P(t1t2) = 0.
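The two simplest correlation cases can be illustrated with a short sketch. The function names, the string labels "independent" and "exclusive", and the example probabilities below are hypothetical and chosen only for illustration; the probability that at least one of the two tuples is present is obtained by ordinary inclusion-exclusion, which is an addition not stated in the definition above.

```python
def joint_probability(p1: float, p2: float, correlation: str) -> float:
    """Return P(t1t2), the probability that both tuples are in the database.

    `correlation` is a hypothetical label: "independent" or "exclusive".
    """
    if correlation == "independent":
        # Independent tuples: P(t1t2) = P(t1) * P(t2)
        return p1 * p2
    if correlation == "exclusive":
        # Exclusive (disjoint) tuples never occur together, so their
        # marginal probabilities cannot sum to more than 1.
        if p1 + p2 > 1.0:
            raise ValueError("exclusive tuples require P(t1) + P(t2) <= 1")
        return 0.0
    raise ValueError(f"unknown correlation: {correlation!r}")


def either_probability(p1: float, p2: float, correlation: str) -> float:
    """Return P(t1 or t2) by inclusion-exclusion."""
    return p1 + p2 - joint_probability(p1, p2, correlation)


# Hypothetical tuple probabilities, for illustration only.
p_t1, p_t2 = 0.5, 0.25

print(joint_probability(p_t1, p_t2, "independent"))   # 0.125
print(joint_probability(p_t1, p_t2, "exclusive"))     # 0.0
print(either_probability(p_t1, p_t2, "independent"))  # 0.625
print(either_probability(p_t1, p_t2, "exclusive"))    # 0.75
```

With the same marginal probabilities, the two correlation declarations give different answers to the same question, which is why a probabilistic database has to record correlations and not only the per-tuple probabilities.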

Historical...


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. University of Washington, Seattle, USA