Scalable, Peer-Based Mediation Across XML Schemas and Ontologies
Research on the Semantic Web has focused on reasoning about data that is semantically annotated in the RDF data model, with concepts and properties specified in rich ontology languages such as OWL. However, to flourish, the Semantic Web needs to provide interoperability both between sites with different ontologies and with existing, non-RDF data and the applications operating on them. To achieve this, we are faced with two problems. First, most of the world’s data is available not in RDF but in XML; XML and the applications consuming it rely not only on the domain structure of the data, but also on its document structure. Hence, to provide interoperability between such sources, we must map between both their domain structures and their document structures. Second, data management practitioners often prefer to exchange data through local point-to-point data translations, rather than mapping to common mediated schemas or ontologies.
In this chapter, we present the Piazza system, which addresses the challenges of mediating between data sources on the Semantic Web by mapping both the domain structure and the document structure. A key aspect of Piazza is its support for mapping between XML data and RDF data that is accompanied by OWL ontology definitions. Mappings in Piazza are provided at a local scale between small sets of nodes, and Piazza's query answering algorithm is able to chain sets of mappings together to obtain relevant data from across the system. We describe our experiences with the prototype Piazza system and a data sharing scenario implemented using it.
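The idea of chaining local mappings can be illustrated with a minimal sketch. It is not Piazza's query answering algorithm, which reformulates queries over XML and RDF mappings. It only shows how pairwise mappings between peers make data reachable transitively. The peer names, the `mapping_chains` function, and the treatment of a mapping as a directed pair are all assumptions made for illustration.

```python
from collections import deque


def mapping_chains(mappings, start):
    """For each peer reachable from `start`, return one shortest chain of mappings.

    `mappings` is a list of (source_peer, target_peer) pairs; each pair stands in
    for a local, point-to-point mapping (hypothetical representation).
    """
    # Build an adjacency list from the pairwise mappings.
    neighbors = {}
    for src, dst in mappings:
        neighbors.setdefault(src, []).append(dst)

    # Breadth-first search; chains[peer] is the list of mappings followed to reach it.
    chains = {start: []}
    queue = deque([start])
    while queue:
        peer = queue.popleft()
        for nxt in neighbors.get(peer, []):
            if nxt not in chains:
                chains[nxt] = chains[peer] + [(peer, nxt)]
                queue.append(nxt)
    return chains


# Hypothetical peers: no mapping links A to D directly, but a chain does.
peer_mappings = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "A")]
chains = mapping_chains(peer_mappings, "A")
```

In this toy setting, a query posed at peer A could draw on data at D by following the chain A to B to C to D, even though nobody wrote a mapping between A and D. Peer E is not reachable from A because its only mapping points the other way.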
Keywords: Tree Pattern, Description Logic, Target Schema, Document Structure, XPath Expression