Type Inference on Noisy RDF Data

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8218)


Type information is very valuable in knowledge bases. However, most large open knowledge bases are incomplete with respect to type information, and, at the same time, contain noisy and incorrect data. That makes classic type inference by reasoning difficult. In this paper, we propose the heuristic link-based type inference mechanism SDType, which can handle noisy and incorrect data. Instead of leveraging T-box information from the schema, SDType takes the actual use of a schema into account and thus is also robust to misused schema elements.
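The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea of inferring types from how properties are actually used, not SDType itself. All names here (`learn_type_distributions`, `score_types`, the toy triples and types) are hypothetical, and the unweighted average over links is an assumption made for illustration.

```python
from collections import Counter, defaultdict

def learn_type_distributions(triples, types):
    """Estimate P(type | resource occurs in `role` of `prop`) from typed resources only."""
    counts = defaultdict(Counter)  # (prop, role) -> type -> count
    totals = Counter()             # (prop, role) -> number of typed occurrences
    for s, p, o in triples:
        for role, res in (("subject", s), ("object", o)):
            res_types = types.get(res)
            if not res_types:
                continue  # untyped resources contribute no evidence
            totals[(p, role)] += 1
            for t in res_types:
                counts[(p, role)][t] += 1
    return {key: {t: c / totals[key] for t, c in cnt.items()}
            for key, cnt in counts.items()}

def score_types(resource, triples, dists):
    """Average the per-link type distributions over every link of the resource.

    Assumption: every link gets the same weight.
    """
    scores = Counter()
    n = 0
    for s, p, o in triples:
        for role, res in (("subject", s), ("object", o)):
            if res == resource and (p, role) in dists:
                n += 1
                for t, prob in dists[(p, role)].items():
                    scores[t] += prob
    return {t: v / n for t, v in scores.items()} if n else {}

# Toy data (hypothetical). "Vienna" is untyped; ("Dave", "bornIn", "France")
# plays the role of a noisy statement that misuses the property.
triples = [
    ("Berlin", "capitalOf", "Germany"),
    ("Paris", "capitalOf", "France"),
    ("Rome", "capitalOf", "Italy"),
    ("Vienna", "capitalOf", "Austria"),
    ("Alice", "bornIn", "Berlin"),
    ("Bob", "bornIn", "Paris"),
    ("Carol", "bornIn", "Vienna"),
    ("Dave", "bornIn", "France"),
]
types = {
    "Berlin": {"City"}, "Paris": {"City"}, "Rome": {"City"},
    "Germany": {"Country"}, "France": {"Country"},
    "Italy": {"Country"}, "Austria": {"Country"},
    "Alice": {"Person"}, "Bob": {"Person"},
    "Carol": {"Person"}, "Dave": {"Person"},
}

dists = learn_type_distributions(triples, types)
scores = score_types("Vienna", triples, dists)
print(max(scores, key=scores.get))  # City
```

In this toy example the single noisy `bornIn` statement only lowers the score for `City` instead of forcing a wrong conclusion, which is the kind of robustness a domain/range reasoner applied to the same data would lack.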


Keywords: Type Inference, Noisy Data, Link-based Classification



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Research Group Data and Web Science, University of Mannheim, Germany
