Scalable Cleanup of Information Extraction Data Using Ontologies

  • Julian Dolby
  • James Fan
  • Achille Fokoue
  • Aditya Kalyanpur
  • Aaron Kershenbaum
  • Li Ma
  • William Murdock
  • Kavitha Srinivas
  • Christopher Welty
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4825)

Abstract

The approach of using ontology reasoning to cleanse the output of information extraction tools was first articulated in SemantiClean. A limiting factor in applying this approach has been that ontology reasoning to find inconsistencies does not scale to the size of data produced by information extraction tools. In this paper, we describe techniques to scale inconsistency detection, and illustrate the use of our techniques to produce a consistent subset of a knowledge base with several thousand inconsistencies.

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Julian Dolby (1)
  • James Fan (1)
  • Achille Fokoue (1)
  • Aditya Kalyanpur (1)
  • Aaron Kershenbaum (1)
  • Li Ma (2)
  • William Murdock (1)
  • Kavitha Srinivas (1)
  • Christopher Welty (1)
  1. IBM Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598, USA
  2. IBM China Research Lab, Beijing 100094, China
