Abstract
Few practical Semantic Web applications exist today. The main reason is that most existing web documents contain only information that is not machine-readable, so software agents cannot act on it. Earlier work has addressed this problem through manual or semi-automatic annotation of web documents. This paper presents an automatic approach to web document annotation based on a domain-specific ontology. Because complete semantic annotation of a web document remains a hard task, we simplify the problem to annotating instances of ontology concepts in web documents, and we propose an Ontology Instance Learning (OIL) method that extracts such instances from both the structure and the free text of web documents. The extracted instances are then used to annotate web pages in the related domain. In our experiment, the OIL method performs well on real-world web documents.
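To make the annotation step concrete, the following is a minimal sketch of what marking ontology concept instances in a piece of text could look like once a set of instances is already known. It is not the OIL method itself, which the abstract does not specify: the toy ontology, the `annotate` function, and the `data-concept` attribute are all assumptions made for illustration, and the instance lists are hand-written here rather than learned.

```python
import re

# Hypothetical toy ontology: concept name -> known instance strings.
# In the paper these instances would come from the OIL method; here
# they are written by hand purely for illustration.
ONTOLOGY_INSTANCES = {
    "University": ["Stanford University", "MIT"],
    "City": ["Boston", "Hefei"],
}


def annotate(text, instances=ONTOLOGY_INSTANCES):
    """Wrap each known instance in a span tagged with its concept."""
    # Invert the mapping so each instance string points to its concept.
    lookup = {
        inst: concept
        for concept, insts in instances.items()
        for inst in insts
    }
    # Try longer names first so they win over shorter overlapping ones.
    names = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b"
    )

    def tag(match):
        name = match.group(1)
        return f'<span data-concept="{lookup[name]}">{name}</span>'

    return pattern.sub(tag, text)


if __name__ == "__main__":
    print(annotate("MIT is near Boston."))
```

Plain string matching is used only to keep the sketch short. It cannot discover new instances, which is the part of the problem the paper's instance learning is meant to address.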
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shu, W., Enhong, C. (2004). An Instance Learning Approach for Automatic Semantic Annotation. In: Zhang, J., He, JH., Fu, Y. (eds) Computational and Information Science. CIS 2004. Lecture Notes in Computer Science, vol 3314. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30497-5_148
DOI: https://doi.org/10.1007/978-3-540-30497-5_148
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24127-0
Online ISBN: 978-3-540-30497-5
eBook Packages: Computer Science, Computer Science (R0)