
The VLDB Journal, Volume 19, Issue 2, pp 231–256

Schema mapping and query translation in heterogeneous P2P XML databases

  • Angela Bonifati (Email author)
  • Elaine Chang
  • Terence Ho
  • Laks V. S. Lakshmanan
  • Rachel Pottinger
  • Yongik Chung
Regular Paper

Abstract

Peers in a peer-to-peer data management system often have heterogeneous schemas and no mediated global schema. To translate queries across peers, we assume each peer provides correspondences between its schema and a small number of other peer schemas. We focus on query reformulation in the presence of heterogeneous XML schemas, including data–metadata conflicts. We develop an algorithm for inferring precise mapping rules from informal schema correspondences. We define the semantics of query answering in this setting and develop a query translation algorithm. Our translation handles an expressive fragment of XQuery and works both along and against the direction of mapping rules. We describe the HePToX heterogeneous P2P XML data management system, which incorporates our results. We report the results of extensive experiments on HePToX on both synthetic and real datasets. We demonstrate our system's utility and scalability on different P2P distributions.

Keywords

Schema mapping, XML query translation, Heterogeneous Peer-to-Peer XML databases

Copyright information

© Springer-Verlag 2009

Authors and Affiliations

  • Angela Bonifati (1, Email author)
  • Elaine Chang (2)
  • Terence Ho (2)
  • Laks V. S. Lakshmanan (2)
  • Rachel Pottinger (2)
  • Yongik Chung (2)

  1. Icar-CNR, Rende (CS), Italy
  2. UBC, Vancouver, Canada
