Advancing Topic Ontology Learning through Term Extraction

  • Blaž Fortuna
  • Nada Lavrač
  • Paola Velardi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5351)

Abstract

This paper presents a novel methodology for topic ontology learning from text documents. The proposed methodology, named OntoTermExtraction (Term Extraction for Ontology learning), is based on OntoGen, a semi-automated tool for topic ontology construction, upgraded by using an advanced terminology extraction tool in an iterative, semi-automated ontology construction process. This process consists of (a) document clustering to find the nodes in the topic ontology, (b) term extraction from document clusters, (c) populating the term vocabulary and keyword extraction, and (d) choosing the concept names by comparing the best-ranked terms with the extracted keywords. The approach was successfully used for generating the ontology of topics in Inductive Logic Programming, learned semi-automatically from papers indexed in the ILPnet2 publications database.
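Steps (b) to (d) of this process can be illustrated with a minimal sketch. It uses neither OntoGen nor the terminology extraction tool: the clusters from step (a) are assumed to be given, frequent stopword-free n-grams stand in for extracted terms, and a smoothed TF-IDF ranking stands in for keyword extraction. All function names, the stopword list, and the sample documents are hypothetical and chosen only for illustration.

```python
from collections import Counter
import math
import re

# Hypothetical, deliberately tiny stopword list (an assumption of this sketch).
STOPWORDS = {"a", "an", "and", "for", "from", "in", "of", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def extract_terms(cluster_docs, max_len=3):
    """Step (b) stand-in: count 2- to max_len-word n-grams without stopwords."""
    counts = Counter()
    for doc in cluster_docs:
        tokens = tokenize(doc)
        for n in range(2, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if not STOPWORDS.intersection(gram):
                    counts[" ".join(gram)] += 1
    return counts


def extract_keywords(cluster_docs, all_docs, top_n=3):
    """Step (c) stand-in: rank single words by smoothed TF-IDF.

    Term frequency is counted inside the cluster, document frequency
    over the whole collection.
    """
    doc_freq = Counter()
    for doc in all_docs:
        doc_freq.update(set(tokenize(doc)))
    tf = Counter(
        tok for doc in cluster_docs for tok in tokenize(doc) if tok not in STOPWORDS
    )
    n_docs = len(all_docs)
    scores = {
        word: count * math.log((1 + n_docs) / (1 + doc_freq[word]))
        for word, count in tf.items()
    }
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


def suggest_concept_name(terms, keywords):
    """Step (d): pick the best-ranked term that shares a word with the keywords.

    Terms are ranked by frequency, then by length in words, then alphabetically.
    """
    ranked = sorted(terms, key=lambda t: (-terms[t], -len(t.split()), t))
    for term in ranked:
        if set(term.split()) & set(keywords):
            return term
    return ranked[0] if ranked else None


# Invented sample clusters; a real run would use the output of step (a).
clusters = [
    [
        "Inductive logic programming learns logic programs from examples.",
        "Applications of inductive logic programming in bioinformatics.",
    ],
    [
        "Relational data mining finds patterns in relational databases.",
        "Scaling relational data mining to large databases.",
    ],
]
all_docs = [doc for cluster in clusters for doc in cluster]
names = [
    suggest_concept_name(extract_terms(c), extract_keywords(c, all_docs))
    for c in clusters
]
print(names)
```

In the semi-automated setting described above, such a suggestion would only be a proposal that the user can accept or override when naming a concept.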

Keywords

Topic ontology, ontology construction, term extraction

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Blaž Fortuna (1)
  • Nada Lavrač (1, 2)
  • Paola Velardi (3)

  1. Jožef Stefan Institute, Ljubljana, Slovenia
  2. University of Nova Gorica, Gorica, Slovenia
  3. Universita di Roma “La Sapienza”, Roma, Italy
