Abstract
Data integration is the flexible and managed federation, analysis, and processing of data from different distributed sources. It is becoming as important as data mining for exploiting the value of the large, distributed data sets available today. Distributed processing infrastructures such as Grids can support data integration across geographically distributed sites. This paper presents a framework for integrating heterogeneous XML data sources distributed among the nodes of a Grid. We propose a query reformulation algorithm that combines and queries XML documents through a decentralized point-to-point mediation process among the data sources, based on schema mappings. This XML integration formalism, XMAP, is exposed as a Grid Service within the GDIS architecture, a service-based architecture that provides data integration in Grids using a decentralized approach. We discuss the model underlying this architecture and show how it fits the XMAP formalism and algorithm.
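As a rough illustration of query reformulation over point-to-point schema mappings, the following minimal sketch chains path-prefix mappings between peers until no new reformulation appears. It is not the XMAP algorithm described in the paper: the peer names, the paths, the `MAPPINGS` table, and the `rewrite` and `reformulate` helpers are all hypothetical, and real schema mappings are richer than simple prefix substitutions.

```python
from collections import deque

# Hypothetical point-to-point mappings (an assumption for illustration):
# a path prefix on one peer corresponds to a path prefix on another peer.
MAPPINGS = [
    ("peerA", "/library/book", "peerB", "/catalog/item"),
    ("peerB", "/catalog/item", "peerC", "/store/product"),
    ("peerC", "/store/product", "peerA", "/library/book"),  # closes a cycle
]


def rewrite(path, src_prefix, dst_prefix):
    """Swap src_prefix for dst_prefix in path; return None if it does not match."""
    if path == src_prefix or path.startswith(src_prefix + "/"):
        return dst_prefix + path[len(src_prefix):]
    return None


def reformulate(peer, path, mappings=MAPPINGS):
    """Collect every (peer, path) pair reachable by chaining mappings.

    The `seen` set stops the traversal from looping on cyclic mappings.
    """
    seen = {(peer, path)}
    queue = deque([(peer, path)])
    while queue:
        cur_peer, cur_path = queue.popleft()
        for src_peer, src_prefix, dst_peer, dst_prefix in mappings:
            if src_peer != cur_peer:
                continue
            new_path = rewrite(cur_path, src_prefix, dst_prefix)
            if new_path is not None and (dst_peer, new_path) not in seen:
                seen.add((dst_peer, new_path))
                queue.append((dst_peer, new_path))
    # Exclude the original query so only reformulations are returned.
    return sorted(seen - {(peer, path)})
```

Calling `reformulate("peerA", "/library/book/title")` with this table yields one rewritten path for `peerB` and one for `peerC`, with no central mediator involved; each peer only needs the mappings to its direct neighbours.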
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Comito, C., Talia, D. (2006). XML Data Integration in OGSA Grids. In: Pierson, JM. (eds) Data Management in Grids. DMG 2005. Lecture Notes in Computer Science, vol 3836. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11611950_2
DOI: https://doi.org/10.1007/11611950_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31212-3
Online ISBN: 978-3-540-32452-2
eBook Packages: Computer Science, Computer Science (R0)