Abstract
Knowledge portals aim to facilitate the location, sharing and dissemination of information by placing ontologies at the core of the system. In heterogeneous environments where content providers are free to deliver content in any format, mechanisms are required that extract and lift these content sources onto a common ontology model. This paper focuses on document providers, where diversity stems from either the metadata vocabulary or the metadata location mechanism used. The ontology repository should be isolated from this heterogeneity. To this end, a rule-based approach is presented in which rules encapsulate the specificities of each provider. The paper presents a working system where JENA, WebDAV, and QuickRules realise the knowledge portal, the resource repository and the rule component, respectively. Rules are given for PDF, WORD and OpenOffice resources.
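The idea of rules that encapsulate provider specificities can be illustrated with a minimal sketch. This is not the paper's JENA/QuickRules implementation: the `Rule` class, the `lift` function, the provider metadata keys and the choice of Dublin Core as the common vocabulary are all assumptions made for illustration. Each rule pairs a condition on the resource with a mapping from one provider's metadata vocabulary to shared ontology properties, so the repository only ever sees uniform triples.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]

# Assumption: Dublin Core stands in for the common ontology model.
DC = "http://purl.org/dc/elements/1.1/"


@dataclass
class Rule:
    """Hypothetical condition-action rule hiding one provider's vocabulary."""
    name: str
    applies: Callable[[dict], bool]  # condition: does this rule handle the resource?
    mapping: Dict[str, str]          # provider metadata key -> ontology property

    def fire(self, uri: str, resource: dict) -> List[Triple]:
        # Action: translate the provider keys that are present into triples.
        raw = resource["metadata"]
        return [(uri, DC + prop, raw[key])
                for key, prop in self.mapping.items() if key in raw]


# Illustrative rules; the key names are assumptions, not taken from the paper.
RULES = [
    Rule("pdf", lambda r: r["format"] == "pdf",
         {"Title": "title", "Author": "creator"}),
    Rule("openoffice", lambda r: r["format"] == "openoffice",
         {"dc:title": "title", "meta:initial-creator": "creator"}),
]


def lift(uri: str, resource: dict, rules: List[Rule] = RULES) -> List[Triple]:
    """Apply every matching rule and collect the resulting triples."""
    triples: List[Triple] = []
    for rule in rules:
        if rule.applies(resource):
            triples.extend(rule.fire(uri, resource))
    return triples


if __name__ == "__main__":
    pdf = {"format": "pdf", "metadata": {"Title": "Report", "Author": "Ann"}}
    for triple in lift("doc1", pdf):
        print(triple)
```

Under these assumptions, supporting a new provider means adding one rule to the list, while the code that consumes the triples stays unchanged.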
This work was partially supported by the Spanish Science and Technology Ministry (MCYT) and European Social Funds (FEDER) under contract TIC2002-01442. It also benefited from funding from the “Consejería de Ciencia y Tecnología” of the Junta de la Comunidad de Castilla La Mancha (PCB-02-001).
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Iturrioz, J., Díaz, O., Anzuola, S.F. (2004). Facing Document-Provider Heterogeneity in Knowledge Portals. In: Persson, A., Stirna, J. (eds) Advanced Information Systems Engineering. CAiSE 2004. Lecture Notes in Computer Science, vol 3084. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25975-6_28
DOI: https://doi.org/10.1007/978-3-540-25975-6_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22151-7
Online ISBN: 978-3-540-25975-6
eBook Packages: Springer Book Archive