Counting Graph Matches with Adaptive Statistics Collection

Feng, Jianhua; Qian, Qian; Liao, Yuguo; Zhou, Lizhu

doi:10.1007/11775300_38

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4016)

Abstract

Efficient query processing over large-scale graph-structured data demands high-quality statistics collection and selectivity estimation. Precise and succinct statistics about graph-structured data play a crucial role in estimating the selectivity of graph queries. In this paper, we propose SMT (Succinct Markov Table), an approach that achieves high precision in selectivity estimation while consuming little memory. SMT rests on four core notions: constructing, refining, compressing, and estimating. The efficient algorithm SMTBuilder builds an adaptive statistics model in the form of an SMT. SMT refining introduces versatile optimization rules that exploit local bi-directional reachability. SMT compressing introduces effective grouping techniques. Statistical methods based on SMT are then used to estimate the selectivity of various graph queries, especially twig queries. Through a thorough experimental study, we demonstrate SMT's advantages in accuracy and space over a previously known alternative, and we identify the optimization rules and compression techniques best suited to different real-life data sets.
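
The abstract does not give the details of SMT, so the following is only a minimal sketch of the general Markov-table idea that the name suggests, not the authors' SMTBuilder algorithm. It assumes a node-labeled directed graph, stores counts of label paths of length 1 and 2, and estimates longer simple paths under a first-order Markov assumption. The function names build_markov_table and estimate_path_count and the toy graph are hypothetical; refining, compressing, and twig estimation are not covered.

```python
from collections import Counter


def build_markov_table(labels, edges):
    """Count label paths of length 1 and 2 in a node-labeled directed graph.

    labels: dict mapping node id -> label (hypothetical input format)
    edges:  iterable of (src, dst) node-id pairs
    """
    table = Counter()
    for label in labels.values():
        table[(label,)] += 1  # length-1 paths: one per node
    for src, dst in edges:
        table[(labels[src], labels[dst])] += 1  # length-2 paths: one per edge
    return table


def estimate_path_count(table, path):
    """Estimate the number of matches of a simple label path.

    Assumes each step depends only on the previous label:
    f(t1/../tn) ~ f(t1/t2) * prod_{i=2..n-1} f(ti/ti+1) / f(ti)
    """
    path = tuple(path)
    if len(path) <= 2:
        return float(table[path])  # stored exactly in the table
    estimate = float(table[path[:2]])
    for i in range(1, len(path) - 1):
        denom = table[(path[i],)]
        if denom == 0:
            return 0.0  # label never occurs, so no match is possible
        estimate *= table[(path[i], path[i + 1])] / denom
    return estimate


# Toy graph: one 'a' node, two 'b' nodes, three 'c' nodes.
toy_labels = {1: "a", 2: "b", 3: "b", 4: "c", 5: "c", 6: "c"}
toy_edges = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 6)]
toy_table = build_markov_table(toy_labels, toy_edges)
print(estimate_path_count(toy_table, ["a", "b", "c"]))
```

On this toy graph the estimate for a/b/c happens to equal the true count of three matching paths; in general the Markov assumption yields only an approximation.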

The work was supported by the National Natural Science Foundation of China under Grant No. 60573094, the Tsinghua Basic Research Foundation under Grant No. JCqn2005022, and the Zhejiang Natural Science Foundation under Grant No. Y105230.

Author information

Authors and Affiliations

Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China
Jianhua Feng, Qian Qian, Yuguo Liao & Lizhu Zhou

Editor information

Editors and Affiliations

Chinese University of Hong Kong, Hong Kong, China
Jeffrey Xu Yu
Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, 153-8505, Tokyo, Japan
Masaru Kitsuregawa
Department of Computing, Hong Kong Polytechnic University, Hong Kong
Hong Va Leong

Cite this paper

Feng, J., Qian, Q., Liao, Y., Zhou, L. (2006). Counting Graph Matches with Adaptive Statistics Collection. In: Yu, J.X., Kitsuregawa, M., Leong, H.V. (eds) Advances in Web-Age Information Management. WAIM 2006. Lecture Notes in Computer Science, vol 4016. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11775300_38

DOI: https://doi.org/10.1007/11775300_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35225-9
Online ISBN: 978-3-540-35226-6
eBook Packages: Computer Science, Computer Science (R0)