Abstract
Probabilistic XML data is widely used in web applications. Recent work has focused mostly on structured queries over probabilistic XML data. A few works have addressed keyword queries; however, they discuss only the independent and the mutually exclusive relationships among sibling nodes. This paper addresses the problem of keyword filtering over probabilistic XML data. We propose the PrXML{exp, ind, mux} model, which represents a more general relationship among XML sibling nodes, as the basis for keyword filtering. We define kdptab, the keyword distribution probability table of a subtree, together with the dot product, Cartesian product, and addition operations on kdptabs. Under the PrXML{exp, ind, mux} model, our method scans the XML document bottom-up and performs keyword filtering efficiently based on SLCA semantics. Finally, the features and efficiency of our method are evaluated with extensive experimental results.
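The abstract does not define kdptab or its three operations, so the following is only an illustrative sketch of the general idea of a bottom-up keyword distribution table, not the paper's method. It assumes a hypothetical dictionary-based tree encoding (`kind`, `words`, `children`, `subsets`) and hypothetical helper names (`combine`, `mixture`, `table`), and it applies ordinary probability rules for independent (ind), mutually exclusive (mux), and explicitly enumerated (exp) children. It computes only the probability that a subtree contains each subset of the query keywords; it does not implement SLCA semantics.

```python
from itertools import product

EMPTY = frozenset()

def combine(t1, t2):
    """Merge tables of two subtrees that are both present and independent."""
    out = {}
    for (a, pa), (b, pb) in product(t1.items(), t2.items()):
        out[a | b] = out.get(a | b, 0.0) + pa * pb
    return out

def mixture(weighted_tables):
    """Weighted sum of tables for mutually exclusive alternatives.

    Any probability mass left over (weights summing to less than 1)
    is assigned to the case where nothing is present.
    """
    out, total = {}, 0.0
    for p, t in weighted_tables:
        total += p
        for s, q in t.items():
            out[s] = out.get(s, 0.0) + p * q
    if total < 1.0:
        out[EMPTY] = out.get(EMPTY, 0.0) + (1.0 - total)
    return out

def table(node, query):
    """Return {keyword subset: probability} for the subtree rooted at node."""
    kind = node["kind"]
    kids = node.get("children", [])
    if kind == "elem":
        # Ordinary element: its own keywords, then all children are present.
        t = {frozenset(node.get("words", ())) & query: 1.0}
        for child in kids:
            t = combine(t, table(child, query))
        return t
    if kind == "ind":
        # Each child (p, subtree) appears independently with probability p.
        t = {EMPTY: 1.0}
        for p, child in kids:
            t = combine(t, mixture([(p, table(child, query))]))
        return t
    if kind == "mux":
        # At most one child (p, subtree) appears.
        return mixture([(p, table(child, query)) for p, child in kids])
    if kind == "exp":
        # node["subsets"] lists (p, child indices): an explicit distribution
        # over which subset of the children appears together.
        weighted = []
        for p, idxs in node["subsets"]:
            t = {EMPTY: 1.0}
            for i in idxs:
                t = combine(t, table(kids[i], query))
            weighted.append((p, t))
        return mixture(weighted)
    raise ValueError(f"unknown node kind: {kind}")

# Usage: a root element whose two keyword-bearing children are independent.
query = frozenset({"xml", "keyword"})
doc = {"kind": "elem", "children": [
    {"kind": "ind", "children": [
        (0.8, {"kind": "elem", "words": ["xml"]}),
        (0.5, {"kind": "elem", "words": ["keyword"]}),
    ]},
]}
result = table(doc, query)
print(result[query])  # probability that the document contains both keywords
```

In this toy document the two children are independent, so the probability of seeing both keywords is the product of their existence probabilities. A table keyed by keyword subsets keeps the work per node bounded by the number of subsets of the query rather than by the number of possible worlds.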
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, C., Chang, L., Sha, C., Wang, X., Zhou, A. (2012). Keywords Filtering over Probabilistic XML Data. In: Sheng, Q.Z., Wang, G., Jensen, C.S., Xu, G. (eds) Web Technologies and Applications. APWeb 2012. Lecture Notes in Computer Science, vol 7235. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29253-8_16
DOI: https://doi.org/10.1007/978-3-642-29253-8_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29252-1
Online ISBN: 978-3-642-29253-8
eBook Packages: Computer Science, Computer Science (R0)