On the Difficulty of Finding Optimal Relational Decompositions for XML Workloads: A Complexity Theoretic Perspective

  • Rajasekar Krishnamurthy
  • Venkatesan T. Chakaravarthy
  • Jeffrey F. Naughton
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2572)

Abstract

A key problem that arises in the context of storing XML documents in relational databases is that of finding an optimal relational decomposition for a given set of XML documents and a given set of XML queries over those documents. While there have been a number of ad hoc solutions proposed for this problem, to our knowledge this paper represents a first step toward formalizing the problem and studying its complexity. It turns out that to even define what one means by an optimal decomposition, one first needs to specify an algorithm to translate XML queries to relational queries, and a cost model to evaluate the quality of the resulting relational queries. By examining an interesting problem embedded in choosing a relational decomposition, we show that choices of different translation algorithms and cost models result in very different complexities for the resulting optimization problems. Our results suggest that, contrary to the trend in previous work, the eventual development of practical algorithms for finding relational decompositions for XML workloads will require judicious choices of cost models and translation algorithms, rather than an exclusive focus on the decomposition problem in isolation.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Rajasekar Krishnamurthy (1)
  • Venkatesan T. Chakaravarthy (1)
  • Jeffrey F. Naughton (1)
  1. University of Wisconsin, Madison, USA
