Abstract
This paper addresses effective search and query answering in heterogeneous web document bases that contain XML documents whose schemas are available. We propose a new approach to the structural approximation of submitted queries. A preliminary schema matching process automatically identifies the similarities between the schemas involved. In the query processing phase, these similarities are used to automatically rewrite a query written on a source schema so that it is compatible with the other useful XML documents. The approach has been implemented as a web service and can deliver middleware rewriting services in any open-architecture XML repository system offering advanced search capabilities.
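The two phases described above (schema matching first, query rewriting second) can be illustrated with a minimal sketch. This is not the authors' algorithm. The match table, the element names, the threshold, and the rule of multiplying step similarities into one score are all assumptions made for illustration, and the query is reduced to a plain slash-separated path rather than full XQuery.

```python
# Hypothetical output of a schema matching step: for each element name in the
# source schema, the best matching element name in one target schema and a
# similarity score in [0, 1]. All names and scores are invented.
MATCHES = {
    "book": ("publication", 0.9),
    "author": ("writer", 0.8),
    "title": ("title", 1.0),
}


def rewrite_path(path, matches, threshold=0.5):
    """Rewrite a simple slash-separated path against a target schema.

    Returns (rewritten_path, score), or None when some step has no match
    at or above the threshold. The score is the product of the step
    similarities, which is an assumption of this sketch.
    """
    steps = [s for s in path.split("/") if s]
    rewritten = []
    score = 1.0
    for step in steps:
        match = matches.get(step)
        if match is None or match[1] < threshold:
            return None  # this target schema cannot answer the query
        target_name, similarity = match
        rewritten.append(target_name)
        score *= similarity
    return "/" + "/".join(rewritten), score


if __name__ == "__main__":
    print(rewrite_path("/book/author", MATCHES))
    print(rewrite_path("/book/price", MATCHES))
```

In a sketch like this, the score could be used to rank answers coming from different target documents, and a path that cannot be rewritten for a given schema is simply skipped for that schema.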
The present work is partially supported by the “Technologies and Services for Enhanced Content Delivery” FSI 2000 Project.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Mandreoli, F., Martoglia, R., Tiberio, P. (2004). Approximate Query Answering for a Heterogeneous XML Document Base. In: Zhou, X., Su, S., Papazoglou, M.P., Orlowska, M.E., Jeffery, K. (eds) Web Information Systems – WISE 2004. WISE 2004. Lecture Notes in Computer Science, vol 3306. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30480-7_35
DOI: https://doi.org/10.1007/978-3-540-30480-7_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23894-2
Online ISBN: 978-3-540-30480-7
eBook Packages: Springer Book Archive