Automatic Construction of a Semantic, Domain-Independent Knowledge Base

  • David Urbansky
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5872)

Abstract

In this paper, we show which difficulties arise when automatically constructing a domain-independent knowledge base from the web. We outline possible applications for such a knowledge base to emphasize its importance. Current knowledge bases often rely on manually built patterns for extraction and quality assurance, an approach that does not scale well. Our contribution to the community will be a technique for automatically assessing extracted information to ensure its high quality, and a method for keeping the knowledge base up to date. The research builds upon the existing WebKnox system for Web Knowledge Extraction, which is able to extract named entities and facts from the web. This is a position paper.

Keywords

Knowledge Base, Random Graph, Automatic Construction, Entity Extraction, 19th International Joint
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.G.: DBpedia: A nucleus for a web of open data. In: Aberer, K., Choi, K.-S., Noy, N., Allemang, D., Lee, K.-I., Nixon, L.J.B., Golbeck, J., Mika, P., Maynard, D., Mizoguchi, R., Schreiber, G., Cudré-Mauroux, P. (eds.) ASWC 2007 and ISWC 2007. LNCS, vol. 4825, pp. 722–735. Springer, Heidelberg (2007)
  2. Banko, M., Cafarella, M.J., Soderland, S., Broadhead, M., Etzioni, O.: Open Information Extraction from the Web. In: Proceedings of the 20th International Joint Conference on Artificial Intelligence, pp. 2670–2676 (2007)
  3. Berners-Lee, T., Hendler, J., Lassila, O.: The Semantic Web. Scientific American 284(5), 28–37 (2001)
  4. Downey, D., Etzioni, O., Soderland, S.: A Probabilistic Model of Redundancy in Information Extraction. In: Proceedings of the 19th International Joint Conference on Artificial Intelligence, pp. 1034–1041. Professional Book Center (2005)
  5. Etzioni, O., Cafarella, M., Downey, D., Popescu, A.-M., Shaked, T., Soderland, S., Weld, D.S., Yates, A.: Unsupervised named-entity extraction from the Web: An experimental study. Artificial Intelligence 165(1), 91–134 (2005)
  6. Kasneci, G., Ramanath, M., Suchanek, F.M., Weikum, G.: The YAGO-NAGA approach to knowledge discovery. SIGMOD Record 37(4), 41–47 (2008)
  7. Urbansky, D., Feldmann, M., Thom, J.A., Schill, A.: Entity Extraction from the Web with WebKnox. In: Proceedings of the Sixth Atlantic Web Intelligence Conference (to appear, 2009)
  8. Urbansky, D., Thom, J.A., Feldmann, M.: WebKnox: Web Knowledge Extraction. In: Proceedings of the Thirteenth Australasian Document Computing Symposium, pp. 27–34 (2008)
  9. Wang, R.C., Cohen, W.W.: Language-Independent Set Expansion of Named Entities Using the Web. In: The 2007 IEEE International Conference on Data Mining, pp. 342–350 (2007)
  10. Wu, M., Marian, A.: Corroborating Answers from Multiple Web Sources. In: Proceedings of the 10th International Workshop on Web and Databases (WebDB 2007) (2007)
  11. Zhao, S., Betz, J.: Corroborate and Learn Facts from the Web. In: KDD 2007: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 995–1003. ACM, New York (2007)

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • David Urbansky (1)
  1. Department of Computer Science, University of Technology Dresden
