Finding ID Attributes in XML Documents

  • Denilson Barbosa
  • Alberto Mendelzon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2824)

Abstract

We consider the problem of discovering candidate ID and IDREF attributes in a schemaless XML document. We characterize the complexity of the problem, propose a heuristic algorithm for it, and discuss experimental results.
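
As a rough illustration of the problem statement only (this is not the heuristic algorithm the paper proposes), the sketch below checks two necessary conditions that follow from the XML 1.0 rules for ID and IDREF attributes: the values of an ID attribute never repeat, and every value of an IDREF attribute matches some ID value. The function names and the sample document are hypothetical. The sketch ignores further requirements, such as ID values being valid XML names and being unique across all ID attributes in the document.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def attribute_values(xml_text):
    """Map each (element tag, attribute name) pair to the list of values it takes."""
    values = defaultdict(list)
    for elem in ET.fromstring(xml_text).iter():
        for name, value in elem.attrib.items():
            values[(elem.tag, name)].append(value)
    return values


def candidate_ids(values):
    """Pairs whose values never repeat, a necessary condition for an ID attribute."""
    return {key for key, vals in values.items() if len(set(vals)) == len(vals)}


def candidate_idrefs(values, id_key):
    """Pairs other than id_key whose every value occurs among id_key's values."""
    targets = set(values[id_key])
    return {key for key, vals in values.items()
            if key != id_key and set(vals) <= targets}


# Hypothetical schemaless document used only for this illustration.
SAMPLE = """
<db>
  <person pid="p1" name="Ann"/>
  <person pid="p2" name="Bob"/>
  <person pid="p3" name="Ann"/>
  <order buyer="p1"/>
  <order buyer="p1"/>
</db>
"""

values = attribute_values(SAMPLE)
print(candidate_ids(values))                        # {('person', 'pid')}
print(candidate_idrefs(values, ("person", "pid")))  # {('order', 'buyer')}
```

In the sample, `name` is rejected as an ID candidate because "Ann" repeats, and `buyer` is rejected because "p1" repeats; `buyer` still qualifies as an IDREF candidate because all of its values occur as `pid` values.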

Keywords

Attribute Mapping; Participation Constraint; Path Expression; Very Large Data Base; Constant Factor Approximation Algorithm
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Denilson Barbosa (1)
  • Alberto Mendelzon (1)
  1. University of Toronto, Toronto, Canada
