Automated Construction of Domain Ontology Taxonomies from Wikipedia

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6861))

Abstract

A key step in turning the idea of the Semantic Web into a feasible system is providing a variety of domain ontologies that are constructed on demand, in an automated manner and in a very short time. In this paper we introduce an unsupervised method for constructing domain ontology taxonomies from Wikipedia. The benefit of using Wikipedia as the source is twofold: first, Wikipedia articles are concise and have a particularly high “density” of domain knowledge; second, the articles represent the consensus of a large community, which avoids disagreements over terms and misinterpretations. The taxonomy construction algorithm, which aims to find the subsumption relation, is based on two different techniques that both apply linguistic parsing: analyzing the first sentence of each Wikipedia article and processing the categories associated with the article. The method has been evaluated against human judgment for two independent domains, and the experimental results indicate that it is robust and achieves high precision.
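The two techniques named in the abstract can be illustrated with a minimal sketch. The paper applies linguistic parsing; the code below swaps in plain regular expressions instead, so it is only a rough approximation of the idea and not the authors' algorithm. The function names, the copula pattern, and the stop-word list are all assumptions made for this illustration.

```python
import re

# Assumed pattern for definitional sentences: "<term> is/are/was/were a/an/the <definition>".
_COPULA = re.compile(
    r"\b(?:is|are|was|were)\s+(?:an?|the)\s+(?P<definition>[^.,;]+)",
    re.IGNORECASE,
)

# Assumed list of words that usually end the head noun phrase.
_STOP = re.compile(
    r"\b(?:that|which|who|of|in|for|used|with|from|on|by|to)\b",
    re.IGNORECASE,
)


def _head_noun(phrase):
    """Crude head-noun guess: the last word before the first stop word."""
    noun_phrase = _STOP.split(phrase, maxsplit=1)[0].strip()
    if not noun_phrase:
        return None
    return noun_phrase.split()[-1].lower()


def hypernym_from_first_sentence(sentence):
    """Return a candidate parent concept from an article's first sentence, or None."""
    match = _COPULA.search(sentence)
    if match is None:
        return None
    return _head_noun(match.group("definition"))


def hypernym_from_category(category):
    """Return a candidate parent concept from a category label, or None."""
    return _head_noun(category)


if __name__ == "__main__":
    sentence = "A relational database is a database that organizes data into tables."
    print(hypernym_from_first_sentence(sentence))
    print(hypernym_from_category("Relational database management systems"))
```

A real dependency parser would identify the head noun far more reliably than taking the last word of a phrase; the heuristic here fails on sentences without a copula and on noun phrases with post-modifiers that are not in the stop-word list.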

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jurić, D., Banek, M., Skočir, Z. (2011). Automated Construction of Domain Ontology Taxonomies from Wikipedia. In: Hameurlain, A., Liddle, S.W., Schewe, KD., Zhou, X. (eds) Database and Expert Systems Applications. DEXA 2011. Lecture Notes in Computer Science, vol 6861. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23091-2_37

  • DOI: https://doi.org/10.1007/978-3-642-23091-2_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23090-5

  • Online ISBN: 978-3-642-23091-2

  • eBook Packages: Computer Science (R0)
