Abstract
With the rapid proliferation of XML data, large-scale online applications are emerging. In this research, we aim to enhance XML query processors with the ability to process queries progressively and to report partial results and query progress continually. The methodology is founded on sampling. We show how effective samples can be drawn from semi-structured XML data, as opposed to flat-table relational data, and design several new sampling schemes for XML data. The proposed methodology makes XML query processing more flexible, responsive, user-informed, and user-controllable, to meet emerging needs and future challenges.
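To make the idea of progressive, sampling-based evaluation concrete, the following is a minimal sketch in Python. It is not one of the paper's sampling schemes: it assumes plain uniform sampling without replacement over all elements with a given tag, and the function name `progressive_count`, its parameters, and the toy document are hypothetical choices made for illustration only.

```python
import random
import xml.etree.ElementTree as ET


def progressive_count(root, tag, predicate, batch_size=2, seed=0):
    """Yield (progress, estimate) pairs while scanning elements in random order.

    Hypothetical helper, not the paper's scheme: it shuffles all <tag>
    elements and scales the hit rate seen so far up to the full population.
    """
    candidates = list(root.iter(tag))
    total = len(candidates)
    rng = random.Random(seed)
    rng.shuffle(candidates)  # a random scan order acts as a growing uniform sample

    hits = 0
    for seen, elem in enumerate(candidates, start=1):
        if predicate(elem):
            hits += 1
        # Report a partial result after every batch and at the very end.
        if seen % batch_size == 0 or seen == total:
            progress = seen / total
            estimate = hits / seen * total
            yield progress, estimate


# Toy document (assumed data): eight <book> elements with a <price> child.
doc = ET.fromstring(
    "<lib>"
    + "".join(f"<book><price>{p}</price></book>" for p in (5, 12, 30, 8, 45, 20, 3, 60))
    + "</lib>"
)

results = list(
    progressive_count(doc, "book", lambda b: float(b.findtext("price")) > 10)
)
for progress, estimate in results:
    print(f"{progress:.0%} scanned, estimated matches: {estimate:.1f}")
```

Each yielded pair is a partial result: the fraction of candidates scanned so far and the current estimate of how many books cost more than 10. Once every element has been scanned, the estimate equals the exact count, which is 5 for this toy document. A flat shuffle like this ignores the tree structure of XML, which is the difficulty the abstract says the proposed schemes address.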
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Luo, C., Jiang, Z., Hou, WC., Ozsoyoglu, G. (2009). Progressive Evaluation of XML Queries for Online Aggregation and Progress Indicator. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2009. Lecture Notes in Computer Science, vol 5690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03573-9_33
DOI: https://doi.org/10.1007/978-3-642-03573-9_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03572-2
Online ISBN: 978-3-642-03573-9
eBook Packages: Computer Science (R0)