Using a Lexical Dictionary and a Folksonomy to Automatically Construct Domain Ontologies

  • Daniel Macías-Galindo
  • Wilson Wong
  • Lawrence Cavedon
  • John Thangarajah
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7106)

Abstract

We present and evaluate MKBUILD, a tool for creating domain-specific ontologies. These ontologies, which we call Modular Knowledge Bases (MKBs), contain concepts and associations imported from existing large-scale knowledge resources, in particular WordNet and Wikipedia. The combination of WordNet’s human-crafted taxonomy and Wikipedia’s semantic associations between articles produces a highly connected resource. Our MKBs are used by a conversational agent operating in a small computational environment. We constructed several domains with our technique, and then conducted an evaluation by asking human subjects to rate the domain-relevance of the concepts included in each MKB on a 3-point scale. The proposed methodology achieved precision values between 71% and 88% and recall between 37% and 95% in the evaluation, depending on how the middle-score judgements are interpreted. The results are encouraging considering the cross-domain nature of the construction process and the difficulty of representing concepts as opposed to terms.
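The dependence of the reported figures on how middle-score judgements are read can be illustrated with a minimal sketch. It is not the authors' evaluation code. The numeric encoding of the 3-point scale (1 = not relevant, 2 = middle score, 3 = relevant), the function name `precision`, and the sample ratings are all hypothetical, chosen only to show how a strict and a lenient reading of the same judgements give different precision values.

```python
from typing import Iterable, Set


def precision(ratings: Iterable[int], relevant_scores: Set[int]) -> float:
    """Fraction of rated concepts whose score counts as domain-relevant."""
    ratings = list(ratings)
    if not ratings:
        return 0.0
    hits = sum(1 for r in ratings if r in relevant_scores)
    return hits / len(ratings)


# Hypothetical ratings for ten concepts in one MKB (assumed encoding:
# 1 = not relevant, 2 = middle score, 3 = relevant).
ratings = [3, 3, 2, 1, 3, 2, 3, 1, 3, 2]

# Strict reading: only the top score counts as relevant.
strict = precision(ratings, {3})

# Lenient reading: middle-score judgements also count as relevant.
lenient = precision(ratings, {2, 3})

print(f"strict precision:  {strict:.2f}")
print(f"lenient precision: {lenient:.2f}")
```

Under these made-up ratings, the same set of judgements yields a lower precision under the strict reading than under the lenient one, which is the kind of spread the abstract reports as a range.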

Keywords

Semantic Relatedness; Domain Concept; Name Entity Recognition; Common Noun; Conversational Agent
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Daniel Macías-Galindo (1)
  • Wilson Wong (1)
  • Lawrence Cavedon (1)
  • John Thangarajah (1)

  1. School of Computer Science and I.T., RMIT University, Melbourne, Australia
