Keywords Filtering over Probabilistic XML Data

  • Chenjing Zhang
  • Le Chang
  • Chaofeng Sha
  • Xiaoling Wang
  • Aoying Zhou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7235)

Abstract

Probabilistic XML data is widely used in many web applications. Recent work has focused mostly on structured queries over probabilistic XML data. Little work has addressed keyword queries, and that work considers only independent and mutually exclusive relationships among sibling nodes. This paper addresses the problem of keyword filtering over probabilistic XML data. We propose the PrXML{exp, ind, mux} model, which represents more general relationships among XML sibling nodes. We define kdptab, the keyword distribution probability table of a subtree, together with its dot product, Cartesian product, and addition operations. Under the PrXML{exp, ind, mux} model, our method scans the XML document bottom-up and performs keyword filtering efficiently based on SLCA semantics. Finally, extensive experiments evaluate the features and efficiency of our method.
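The abstract does not give the definitions of kdptab or its operations, so the following is only a minimal illustrative sketch of how a keyword distribution table for a subtree might be combined bottom-up under ind and mux nodes. Everything here is an assumption made for illustration: the bitmask encoding of keyword sets, the function names (`leaf`, `scale_optional`, `combine_independent`, `combine_ind`, `combine_mux`), and the combination rules themselves. It does not cover exp nodes or the SLCA bookkeeping, and it should not be read as the paper's algorithm.

```python
from collections import defaultdict

# Assumption: a table maps a bitmask of keywords found in a subtree
# to the probability of that outcome. Bit i set means keyword i occurs.

def leaf(mask):
    """Table for a certain node whose text contains the keywords in `mask`."""
    return {mask: 1.0}

def scale_optional(table, p):
    """Child exists with probability p; if absent it contributes no keywords."""
    out = defaultdict(float)
    for mask, prob in table.items():
        out[mask] += p * prob
    out[0] += 1.0 - p
    return dict(out)

def combine_independent(a, b):
    """Combine two independent subtrees: union of keyword sets, product of probabilities."""
    out = defaultdict(float)
    for mask_a, prob_a in a.items():
        for mask_b, prob_b in b.items():
            out[mask_a | mask_b] += prob_a * prob_b
    return dict(out)

def combine_ind(children):
    """ind node: each (p, table) child exists independently with probability p."""
    result = {0: 1.0}
    for p, table in children:
        result = combine_independent(result, scale_optional(table, p))
    return result

def combine_mux(children):
    """mux node: at most one (p, table) child exists; probabilities sum to <= 1."""
    out = defaultdict(float)
    total = 0.0
    for p, table in children:
        total += p
        for mask, prob in table.items():
            out[mask] += p * prob
    out[0] += 1.0 - total  # probability that no child is chosen
    return dict(out)

if __name__ == "__main__":
    K1, K2 = 0b01, 0b10
    ind_table = combine_ind([(0.5, leaf(K1)), (0.8, leaf(K2))])
    # Probability that the subtree contains both keywords
    print(ind_table.get(K1 | K2, 0.0))
```

In this sketch, the entry for the full bitmask gives the probability that a subtree contains every query keyword; a filtering threshold could then be applied to that value.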

Keywords

Probabilistic XML, Keywords Filtering, SLCA

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Chenjing Zhang (1, 2)
  • Le Chang (3)
  • Chaofeng Sha (2)
  • Xiaoling Wang (3)
  • Aoying Zhou (3)

  1. College of Information Technology, Shanghai Ocean University, China
  2. School of Computer Science, Fudan University, China
  3. Shanghai Key Laboratory of Trustworthy Computing, Software Engineering Institute, East China Normal University, China