Querying and Updating Probabilistic Information in XML

  • Serge Abiteboul
  • Pierre Senellart
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3896)

Abstract

This paper presents a new model for representing probabilistic information in a semi-structured (XML) database, based on probabilistic event variables. The work is motivated by the need to keep track of both the confidence and the lineage of the information stored in a semi-structured warehouse. For instance, the modules of a (Hidden Web) content warehouse may derive information about the semantics of discovered Web services that is inherently uncertain. Our model, the fuzzy tree model, supports both querying (tree pattern queries with joins) and updating (transactions containing an arbitrary set of insertions and deletions) over probabilistic tree data. We highlight its expressive power and discuss implementation issues.
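The abstract does not spell out how event variables attach to the tree, so the following is only a minimal sketch under stated assumptions, not the authors' definition or implementation. It assumes that each node carries a conjunction of event literals, that the events are independent, and that a node is present exactly when its own condition and those of all its ancestors hold. The names `Node`, `conjunction_probability`, `node_probabilities`, the events `w1` and `w2`, and the sample labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Tuple

# A literal is (event_name, polarity): ("w1", True) reads "w1 holds",
# ("w1", False) reads "w1 does not hold". This encoding is an assumption.
Literal = Tuple[str, bool]


@dataclass
class Node:
    """A tree node annotated with a conjunction of event literals."""
    label: str
    condition: FrozenSet[Literal] = frozenset()
    children: List["Node"] = field(default_factory=list)


def conjunction_probability(literals: FrozenSet[Literal],
                            event_probs: Dict[str, float]) -> float:
    """Probability of a conjunction of literals over independent events."""
    polarity_of: Dict[str, bool] = {}
    for name, polarity in literals:
        # An event required both true and false makes the conjunction impossible.
        if polarity_of.setdefault(name, polarity) != polarity:
            return 0.0
    p = 1.0
    for name, polarity in polarity_of.items():
        p *= event_probs[name] if polarity else 1.0 - event_probs[name]
    return p


def node_probabilities(root: Node,
                       event_probs: Dict[str, float]) -> List[Tuple[str, float]]:
    """List (path, probability of presence) for every node, in document order."""
    results: List[Tuple[str, float]] = []

    def visit(node: Node, path: str, inherited: FrozenSet[Literal]) -> None:
        # A node is present only if its ancestors' conditions also hold.
        accumulated = inherited | node.condition
        here = f"{path}/{node.label}"
        results.append((here, conjunction_probability(accumulated, event_probs)))
        for child in node.children:
            visit(child, here, accumulated)

    visit(root, "", frozenset())
    return results


# Hypothetical sample: a discovered Web service whose description is uncertain.
probs = {"w1": 0.8, "w2": 0.5}
tree = Node("service", children=[
    Node("input", frozenset({("w1", True)}), [
        Node("type", frozenset({("w2", True)})),
    ]),
    Node("output", frozenset({("w1", False)})),
])

for path, p in node_probabilities(tree, probs):
    print(path, round(p, 3))
```

In this sample the `type` node depends on both `w1` and `w2`, so under the independence assumption its probability of presence is 0.8 × 0.5 = 0.4, while `output` is present only when `w1` fails. Keeping the event names on the nodes, rather than a bare number, is what lets such a representation retain lineage alongside confidence.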

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Serge Abiteboul (1)
  • Pierre Senellart (1)

  1. INRIA Futurs & LRI, Université Paris-Sud, France
