
Proximity Search of XML Data Using Ontology and XPath Edit Similarity

  • Conference paper
Database and Expert Systems Applications (DEXA 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4653)

Abstract

The volume of XML data is growing explosively, and as a consequence a large amount of XML data has emerged in which similar content is described using different tag names and structures. In this situation, one cannot write a query against such data without knowing its structure. In this research, we propose a scheme to cope with this problem. Specifically, we expand XPath queries by replacing tag names with similar ones with the help of ontologies. In addition, we aim to realize structural proximity matching of path expressions using edit similarity, a similarity measure based on edit distance. We also discuss the application of SSJoin, an operator that supports similarity joins in relational database systems, to speed up the proposed scheme. Finally, we show the effectiveness of the proposed method through a series of experiments.
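The two ideas named in the abstract, tag-name expansion and edit similarity over path expressions, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a path is compared step by step (each tag name is one token), that edit similarity is normalized as 1 - distance / max(length), and that the ontology lookup can be stood in for by a small hand-written table. The names `SYNONYMS`, `path_steps`, `edit_similarity`, and `expand_query` are hypothetical.

```python
from itertools import product

# Hypothetical stand-in for an ontology lookup (e.g. WordNet); not from the paper.
SYNONYMS = {"author": ["writer"], "book": ["publication"]}


def path_steps(path):
    """Split a simple XPath such as '/a/b/c' into its tag-name steps."""
    return [step for step in path.split("/") if step]


def edit_distance(a, b):
    """Levenshtein distance between two sequences (here, lists of tag names)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_similarity(p, q):
    """Assumed normalization: 1 - distance / length of the longer path."""
    a, b = path_steps(p), path_steps(q)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def expand_query(path):
    """Generate every variant of the path with tag names swapped for synonyms."""
    choices = [[s] + SYNONYMS.get(s, []) for s in path_steps(path)]
    return ["/" + "/".join(combo) for combo in product(*choices)]
```

For example, `expand_query("/book/author")` yields four candidate paths, and `edit_similarity("/book/author/name", "/book/name")` scores the two paths as one step apart out of three. A threshold on that score would then decide which paths in the data count as structurally close to the query.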



Editor information

Roland Wagner, Norman Revell, Günther Pernul

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Amagasa, T., Wen, L., Kitagawa, H. (2007). Proximity Search of XML Data Using Ontology and XPath Edit Similarity. In: Wagner, R., Revell, N., Pernul, G. (eds) Database and Expert Systems Applications. DEXA 2007. Lecture Notes in Computer Science, vol 4653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74469-6_30

  • DOI: https://doi.org/10.1007/978-3-540-74469-6_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74467-2

  • Online ISBN: 978-3-540-74469-6

  • eBook Packages: Computer Science, Computer Science (R0)
