Automatic Generation of Taxonomies from the WWW

  • Conference paper
Practical Aspects of Knowledge Management (PAKM 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3336)

Abstract

In this paper we present a methodology for extracting information from the Web to build a taxonomy of terms and Web resources for a given domain. The taxonomy represents a hierarchy of classes and gives the user a general view of the kinds of concepts and the most significant sites available on the Web for that domain. The system makes intensive use of a publicly available search engine, extracts concepts (based on their relation to the initial concept and on statistical data about their appearance), selects and categorizes the most representative Web resources for each concept, and represents the result in a standard way.
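
The abstract does not spell out how concepts are extracted, so the following is only a minimal sketch of one way the statistical step might look, not the authors' implementation. It assumes that candidate subclasses are words appearing directly before the initial keyword (for example "optical sensor" for "sensor") and that a candidate is kept when it occurs on enough distinct pages. The function name `candidate_subclasses`, the `min_pages` threshold, the `STOPWORDS` set, and the sample pages are all hypothetical; in a real system the page texts would come from search engine results.

```python
import re
from collections import Counter

# Hypothetical stopword list: function words that precede the keyword
# but do not name a more specific concept.
STOPWORDS = {"the", "a", "an", "this", "that", "of", "and", "each", "every", "any", "our"}


def candidate_subclasses(keyword, pages, min_pages=2):
    """Return {"<modifier> <keyword>": page_count} for frequent modifiers.

    Assumption for illustration: a word directly preceding `keyword`
    suggests a more specific concept, and the number of distinct pages
    it appears on is the statistic used to filter candidates.
    """
    # Match a lowercase word followed by the keyword (optionally plural).
    pattern = re.compile(r"\b([a-z]+)\s+" + re.escape(keyword) + r"s?\b")
    page_counts = Counter()
    for text in pages:
        # Count each modifier once per page so one long page cannot dominate.
        page_counts.update(set(pattern.findall(text.lower())))
    return {
        f"{word} {keyword}": count
        for word, count in page_counts.items()
        if count >= min_pages and word not in STOPWORDS
    }


# Stand-in for text retrieved through a search engine (made-up sample data).
sample_pages = [
    "An optical sensor measures light. The sensor is small.",
    "Our optical sensors and a temperature sensor are cheap.",
    "Every temperature sensor needs calibration.",
]
subclasses = candidate_subclasses("sensor", sample_pages)
```

Each surviving candidate could then be used as a new query, repeating the step to grow deeper levels of the hierarchy; whether the paper proceeds this way is not stated in the abstract.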

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sánchez, D., Moreno, A. (2004). Automatic Generation of Taxonomies from the WWW. In: Karagiannis, D., Reimer, U. (eds) Practical Aspects of Knowledge Management. PAKM 2004. Lecture Notes in Computer Science, vol 3336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30545-3_20

  • DOI: https://doi.org/10.1007/978-3-540-30545-3_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24088-4

  • Online ISBN: 978-3-540-30545-3

  • eBook Packages: Springer Book Archive
