Extracting Novel Facts from Tables for Knowledge Graph Completion

  • Benno Kruit
  • Peter Boncz
  • Jacopo Urbani
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11778)

Abstract

We propose a new end-to-end method for extending a Knowledge Graph (KG) from tables. Existing techniques tend to interpret tables by focusing on information that is already in the KG, and therefore tend to extract many redundant facts. Our method aims to find more novel facts. We introduce a new technique for table interpretation based on a scalable graphical model using entity similarities. Our method further disambiguates cell values using KG embeddings as an additional ranking method. Other distinctive features are that it makes no assumptions about the underlying KG and that it enables fine-grained tuning of the precision/recall trade-off of the extracted facts. Our experiments show that our approach achieves higher recall during the interpretation process than the state of the art, and is more resistant to the bias toward extracting mostly redundant facts, since it produces more novel extractions.
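
The two ideas of embedding-based ranking and a tunable precision/recall trade-off can be illustrated with a minimal sketch. This is not the authors' implementation: the translation-based (TransE-style) score, the function names `transe_score`, `rerank` and `filter_by_threshold`, the `weight` parameter, and the toy two-dimensional embeddings are all assumptions chosen for illustration. Candidate entities for an ambiguous cell are re-ranked by adding an embedding plausibility score to a label-similarity score, and a confidence threshold then decides which candidates are kept.

```python
import math

def transe_score(head, relation, tail):
    """Negative Euclidean distance ||head + relation - tail||; higher means more plausible."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def rerank(candidates, head_vec, relation_vec, embeddings, weight=0.5):
    """Sort (entity, label_similarity) pairs by label similarity plus a weighted embedding score."""
    scored = []
    for entity, label_sim in candidates:
        plausibility = transe_score(head_vec, relation_vec, embeddings[entity])
        scored.append((entity, label_sim + weight * plausibility))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def filter_by_threshold(scored, threshold):
    """Keep entities scoring at least `threshold`; a higher threshold favours precision over recall."""
    return [entity for entity, score in scored if score >= threshold]

# Hypothetical toy data: two entities share the cell label "Paris".
embeddings = {
    "Paris_(France)": [1.0, 1.0],
    "Paris_(Texas)": [4.0, 0.0],
}
head_vec = [0.0, 0.0]       # assumed embedding of the row's subject entity
relation_vec = [1.0, 1.0]   # assumed embedding of the column's relation
candidates = [("Paris_(France)", 0.9), ("Paris_(Texas)", 0.9)]

ranked = rerank(candidates, head_vec, relation_vec, embeddings)
strict = filter_by_threshold(ranked, 0.5)    # precision-oriented setting
lenient = filter_by_threshold(ranked, -1.0)  # recall-oriented setting
```

In this toy setting both candidates have the same label similarity, so only the embedding score separates them. Raising the threshold keeps fewer, more confident candidates, while lowering it admits more candidates at the cost of precision.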

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
  2. Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
