Abstract
In this paper we present a methodology for extracting information from the Web to build a taxonomy of terms and Web resources for a given domain. The taxonomy represents a hierarchy of classes and gives the user a general view of the kinds of concepts, and the most significant sites, that can be found on the Web for the specified domain. The system makes intensive use of a publicly available search engine, extracts concepts (based on their relation to the initial one and on statistical data about their appearance), selects and categorizes the most representative Web resources for each concept, and represents the result in a standard way.
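The concept-extraction step summarized above might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify how the "relation to the initial one" is detected, so the code assumes a simple heuristic (the word immediately preceding the initial keyword is treated as a candidate specialization). A plain list of page texts stands in for search-engine results, and the names `candidate_concepts`, `build_taxonomy`, `STOPWORDS` and `min_count` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "this", "that", "each", "any"}


def candidate_concepts(documents, keyword, min_count=2):
    """Count words that directly precede `keyword` across the documents.

    Assumption (not stated in the abstract): an immediately preceding
    word, as in "optical sensor", marks a candidate subclass of `keyword`.
    Candidates seen fewer than `min_count` times are discarded.
    """
    key = keyword.lower()
    # Match "<word> <keyword>" or "<word> <keyword>s", case-insensitively.
    pattern = re.compile(r"\b([a-z]+)\s+" + re.escape(key) + r"s?\b")
    counts = Counter()
    for text in documents:
        for word in pattern.findall(text.lower()):
            if word not in STOPWORDS:
                counts[word + " " + key] += 1
    return {term: n for term, n in counts.items() if n >= min_count}


def build_taxonomy(documents, keyword, min_count=2):
    """Return a one-level taxonomy: keyword -> subclasses by frequency."""
    children = candidate_concepts(documents, keyword, min_count)
    ordered = sorted(children, key=lambda term: (-children[term], term))
    return {keyword.lower(): ordered}


if __name__ == "__main__":
    pages = [
        "An optical sensor measures light. The optical sensor is cheap.",
        "A temperature sensor and an optical sensor.",
        "Temperature sensors are common.",
    ]
    print(build_taxonomy(pages, "sensor"))
```

In a fuller pipeline, each retained candidate could itself be submitted as a new query to obtain deeper levels of the hierarchy, and the pages in which it appears could be attached to it as representative resources; both extensions are likewise assumptions rather than details taken from the abstract.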
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sánchez, D., Moreno, A. (2004). Automatic Generation of Taxonomies from the WWW. In: Karagiannis, D., Reimer, U. (eds) Practical Aspects of Knowledge Management. PAKM 2004. Lecture Notes in Computer Science, vol 3336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30545-3_20
DOI: https://doi.org/10.1007/978-3-540-30545-3_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24088-4
Online ISBN: 978-3-540-30545-3
eBook Packages: Springer Book Archive