Ontology Learning

  • Philipp Cimiano
  • Alexander Mädche
  • Steffen Staab
  • Johanna Völker
Chapter
Part of the International Handbooks on Information Systems book series (INFOSYS)

Summary

Ontology learning techniques support an ontology engineer in creating and maintaining an ontology. In this chapter, we give a concise introduction to the field of ontology learning. We present a generic architecture for ontology learning systems and discuss its main components. In addition, we introduce the main problems and challenges addressed in the field and give an overview of the most important methods applied. We conclude with a brief discussion of advanced issues that pose interesting challenges to the state of the art.

Keywords

Inductive Logic Programming, Formal Concept Analysis, Concept Hierarchy, Ontology Engineering, Pointwise Mutual Information
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Philipp Cimiano (1)
  • Alexander Mädche (2)
  • Steffen Staab (3)
  • Johanna Völker (1)
  1. Institute AIFB, University of Karlsruhe, Karlsruhe, Germany
  2. SAP AG, Walldorf, Germany
  3. ISWEB Group, University of Koblenz-Landau, Koblenz, Germany