A Proposal for the Automatic Generation of Instances from Unstructured Text
An ontology is a conceptual representation of a domain resulting from a consensus within a community. One of its main applications is the integration of heterogeneous information sources available on the Web by means of the semantic annotation of web documents. This is the cornerstone of the emerging Semantic Web. However, most of the information on the Web currently consists of text documents with little or no structure, which makes their manual annotation impracticable. This paper addresses the problem of mapping text fragments into a given ontology in order to generate ontology instances that semantically describe this kind of resource. By applying this mapping, we can automatically populate a Semantic Web consisting of text documents that relate to a specific ontology. We have evaluated our approach on an ontology from a real application and a text collection, both in the Archeology domain. Results show the effectiveness of the method as well as its usefulness.
Keywords: Information Extraction, Automatic Generation, Text Document, Semantic Annotation, Oriented Graph