Programming and Computer Software, Volume 39, Issue 1, pp 34–42

Automatic construction and enrichment of informal ontologies: A survey

Abstract

The conceptualization of knowledge required for efficient processing of textual data is usually represented as an ontology. Depending on the knowledge domain and the task, different types of ontologies are constructed: formal ontologies, which involve axioms and detailed relations between concepts; taxonomies, which are hierarchically organized sets of concepts; and informal ontologies, such as Internet encyclopedias created and maintained by user communities. Manual construction of ontologies is a time-consuming and costly process that requires the participation of experts; in recent years, therefore, many systems have appeared that automate this process to a greater or lesser degree. This paper provides an overview of methods for automatic construction and enrichment of ontologies, with a focus on informal ontologies.

Copyright information

© Pleiades Publishing, Ltd. 2013

Authors and Affiliations

  1. Institute for System Programming, Moscow, Russia
