A Resource Efficient Hybrid Data Structure for Twig Queries

  • John Wilson
  • Richard Gourlay
  • Robert Japp
  • Mathias Neumüller
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4156)


Designing data structures for use in mobile devices requires attention on optimising data volumes with associated benefits for data transmission, storage space and battery use. For semistructured data, tree summarisation techniques can be used to reduce the volume of structured elements while dictionary compression can efficiently deal with value-based predicates. This paper introduces an integration of the two approaches using numbering schemes to connect the separate elements, the key strength of this hybrid technique is that both structural and value predicates can be resolved in one graph, while further allowing for compression of the resulting data structure. Performance measures that show advantages of using this hybrid structure are presented, together with an analysis of query resolution using a number of different index granularities. As the current trend is towards the requirement for working with larger semi-structured data sets this work allows for the utilisation of these data sets whilst reducing both the bandwidth and storage space necessary.


Data Graph Numbering Scheme Path Query Inverted List Semistructured Data 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Amato, G., Debole, F., Zezula, P., Rabitti, F.: YAPI: Yet another path index for XML searching. In: Proc of ECDL, pp. 176–187 (2003)Google Scholar
  2. 2.
    Arion, A., Bonifati, A., Costa, G., D’Aguanno, S., Manolescu, I., Pugliese, A.: Xquec: Pushing queries to compressed XML data. In: VLDB, pp. 1065–1068 (2003)Google Scholar
  3. 3.
    Bruno, N., Koudas, N., Srivastava, D.: Holistic twig joins: Optimal XML pattern matching. In: Proc. of SIGMOD, pp. 310–321 (2002)Google Scholar
  4. 4.
    Buneman, P., Grohe, M., Koch, C.: Path queries on compressed XML. In: Proc. of VLDB, pp. 141–152 (2003)Google Scholar
  5. 5.
    Buneman, P., Choi, B., Fan, W., Hutchison, R., Mann, R., Viglas, S.D.: Vectorizing and querying large XML repositories. In: ICDE 2005: Proceedings of the 21st International Conference on Data Engineering, pp. 261–272. IEEE Computer Society, Los Alamitos (2005)Google Scholar
  6. 6.
    Buneman, P., Davidson, S.B., Fernandez, M.F., Suciu, D.: Adding structure to unstructured data. In: Afrati, F.N., Kolaitis, P.G. (eds.) ICDT 1997. LNCS, vol. 1186, pp. 336–350. Springer, Heidelberg (1996)Google Scholar
  7. 7.
    Chen, Q., Lim, A., Ong, K.W.: D(k)-index: an adaptive structural summary for graph-structured data. In: SIGMOD 2003: Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pp. 134–144. ACM Press, New York (2003)CrossRefGoogle Scholar
  8. 8.
    Cheney, J.: Compressing XML with multiplexed hierarchical ppm models. In: Data Compression Conference, p. 163 (2001)Google Scholar
  9. 9.
    Cheney, J.: Tradeoffs in XML database compression. In: DCC, pp. 392–401 (2006)Google Scholar
  10. 10.
    Cheng, J., Ng, W.: Xqzip: Querying compressed XML using structural indexing. In: Bertino, E., Christodoulakis, S., Plexousakis, D., Christophides, V., Koubarakis, M., Böhm, K., Ferrari, E. (eds.) EDBT 2004. LNCS, vol. 2992, pp. 219–236. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  11. 11.
    Chien, S.Y., Vagena, Z., et al.: Efficient structural joins on indexed XML documents. In: Proc. of VLDB, pp. 263–274 (2002)Google Scholar
  12. 12.
    Choi, B., Buneman, P.: XML vectorization: A column-based XML storage model. Technical Report MS-CIS-03-17, University of Pennsylvania (2003)Google Scholar
  13. 13.
    Dietz, P.F.: Maintaining order in a linked list. In: Proc. of STOCS, pp. 122–127 (1982)Google Scholar
  14. 14.
    Ferragina, P., Luccio, F., Manzini, G., Muthukrishnan, S.: Compressing and searching XML data via two zips. In: Proc. World Wide Web Conference 2006 (WWW 2006) (2006)Google Scholar
  15. 15.
    Goldman, R., Widom, J.: DataGuides: Enabling query formulation and optimization in semistructured databases. In: Proc. of VLDB, pp. 436–445 (1997)Google Scholar
  16. 16.
    Grust, T.: Accelerating XPath location steps. In: Proc. of SIGMOD, pp. 109–120 (2002)Google Scholar
  17. 17.
    Halverson, A., Burger, J., et al.: Mixed mode XML query processing. In: Proc of VLDB, pp. 225–236 (2003)Google Scholar
  18. 18.
    Kaushik, R., Bohannon, P., et al.: Covering indexes for branching path queries. In: Proc. of SIGMOD, pp. 133–144 (2002)Google Scholar
  19. 19.
    Kaushik, R., Krishnamurthy, R., et al.: On the integration of structure indexes and inverted lists. In: Proc. of SIGMOD, pp. 779–790 (2004)Google Scholar
  20. 20.
    Kaushik, R., Shenoy, P., et al.: Exploiting local similarity for indexing paths in graph-structured data. In: Proc. of ICDE, pp. 129–140 (2002)Google Scholar
  21. 21.
    Kha, D.D., Yoshikawa, M., Uemura, S.: A structural numbering scheme for XML data. In: Proc. of EDBT, pp. 279–289 (2001)Google Scholar
  22. 22.
    Lee, J., Lee, Y., Kang, S., Jin, H., Lee, S., Kim, B., Song, J.: Bmq-index: Shared and incremental processing of border monitoring queries over data streams. In: The 7th International Conference on Mobile Data Management (MDM 2006) (2006)Google Scholar
  23. 23.
    Li, Q., Moon, B.: Indexing and querying XML data for regular path expressions. In: Proc. of VLDB, pp. 361–370 (2001)Google Scholar
  24. 24.
    Liefke, H., Suciu, D.: XMill: An efficient compressor for XML data. In: Proc. of SIGMOD, pp. 153–164 (2000)Google Scholar
  25. 25.
    McHugh, J., Widom, J.: Query optimization for XML. In: Proc. of VLDB, pp. 315–326 (1999)Google Scholar
  26. 26.
    Min, J.-K., Park, M.-J., Chung, C.-W.: Xpress: A queriable compression for xml data. In: SIGMOD Conference, pp. 122–133 (2003)Google Scholar
  27. 27.
    Naughton, J.F., DeWitt, D.J., et al.: The Niagara internet query system. IEEE Data Eng. Bull. 24(2), 27–33 (2001)Google Scholar
  28. 28.
    Neumüller, M., Wilson, J.N.: Improving XML processing using adapted data structures. In: Proc. of WEBDB, pp. 63–77 (2002)Google Scholar
  29. 29.
    Schmidt, A.R., Waas, F., et al.: XMark: A benchmark for XML data management. In: Proc. of VLDB, pp. 27–32 (2002)Google Scholar
  30. 30.
    Tolani, P.M., Haritsa, J.R.: Xgrind: A query-friendly XML compressor. In: ICDE, pp. 225–234 (2002)Google Scholar
  31. 31.
    Wang, W., Jiang, H., Wang, H., Lin, X., Lu, H., Li, J.: Efficient processing of XML path queries using the disk-based F&B Index. In: Proc. of VLDB, pp. 145–156 (2005)Google Scholar
  32. 32.
    Yan, X., Yu, P.S., Han, J.: Graph indexing: a frequent structure-based approach. In: SIGMOD 2004: Proceedings of the 2004 ACM SIGMOD international conference on Management of data, pp. 335–346. ACM Press, New York (2004)CrossRefGoogle Scholar
  33. 33.
    Zezula, P., Amato, G., et al.: Tree Signatures for XML Querying and Navigation. In: Bellahsène, Z., Chaudhri, A.B., Rahm, E., Rys, M., Unland, R. (eds.) XSym 2003. LNCS, vol. 2824, pp. 149–163. Springer, Heidelberg (2003)CrossRefGoogle Scholar
  34. 34.
    Zhang, C., Naughton, J.F., et al.: On supporting containment queries in relational database management systems. In: Proc. of SIGMOD, pp. 425–436 (2001)Google Scholar
  35. 35.
    Zhang, N., Ozsu, M.T., Ilyas, I.F., Aboulnaga, A., Cheriton, D.R.: Fix: Feature-based indexing technique for XML documents. Technical Report CS-2006-07, School of Computer Science, University of Waterloo (2006)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • John Wilson
    • 1
  • Richard Gourlay
    • 1
  • Robert Japp
    • 1
  • Mathias Neumüller
    • 1
  1. 1.Department of Computer and Information SciencesUniversity of StrathclydeGlasgowUK

Personalised recommendations